PREFACE


PREFACE TO THE SECOND EDITION 

This book has now been in print for almost 10 years and has seen several printings. 
During this period, the field of quantitative finance has experienced abrupt changes, 
some for better and some for worse. But it has been very gratifying to us to have 
heard from many readers that this book has been helpful to them in dealing with the 
ever-changing financial landscape. It appears that to some extent at least the original 
objectives set out in the first edition have been realized. This book can be used either
as an introductory text on simulation for a senior undergraduate course or for a Master’s
level course. It can also be used as a complementary source to the more specialized
treatise by Chan and Wong (2013) entitled Handbook of Financial Risk Management:
Simulations and Case Studies.

This second edition has been thoroughly revised and enhanced. Many of these 
changes were results of teaching different courses in simulation for financial risk 
managers over the years. In addition to cleaning up as many errors and misprints as 
possible, the following specific changes have been incorporated in this revision. 

• Many readers suggested more exercises with worked solutions. In light of these
requests, we have enlarged the problems and answers section.

• The use of VBA in Excel has become common in the financial industry, and the
current edition reflects this: all S-Plus codes have been replaced with VBA codes.

• Thanks to advances in information technology, a new website has been set up for
readers to download the VBA computer codes:
http://www.sta.cuhk.edu.hk/Book/SRMS/
As long as the website is available, we no longer print computer codes, so that
more space can be used for expanded topics.

• Likewise, suggested solutions to exercises at the end of each chapter are now 
available via online supplementary materials. 

• To make the book self-contained, two new chapters, Chapters 1 and 2, have 
been added. Chapter 1 introduces basic concepts of Excel VBA, and Chapter 2 
introduces basic concepts of derivatives. 

• Corresponding to Chapter 9 in the first edition, Chapter 11 of this edition is
expanded to discuss in detail a one-factor interest rate model and the calibration
to yield curves.

• More examples have been added to illustrate the concept of MCMC, in partic- 
ular the Metropolis-Hastings algorithm. 

Finally, we would like to thank colleagues and students alike, who have been giving
us suggestions and ideas throughout the years. In particular, we would like to
thank Dr. Warwick Yuen and Mr. Tom Ng of CUHK and Ms. Sari Friedman and
Mr. Jon Gurstelle of Wiley for their editorial assistance. We also want to express our
gratitude to the Research Grants Council of HKSAR for support at various stages of our
work on this revision.


Ngai Hang Chan and Hoi Ying Wong


Shatin, Hong Kong


PREFACE TO THE FIRST EDITION

Risk management is an important subject in finance. Despite its popularity, risk man- 
agement has a broad and diverse definition that varies from individual to individual. 
One fact remains, however. Every modern risk management method comprises a 
significant amount of computations. To assess the success of a risk management 
procedure, one has to rely heavily on simulation methods. A typical example is the 
pricing and hedging of exotic options in the derivative market. These over-the-counter 
options experience very thin trading volume, and yet their nonlinear features forbid 
the use of analytical techniques. As a result, one has to rely on simulations in order 
to examine their properties. It is therefore not surprising that simulation has become 
an indispensable tool in the financial and risk management industry today. 

Although simulation as a subject has a long history by itself, the same cannot be 
said about risk management. To fully appreciate the power and usefulness of risk 
management, one has to acquire a considerable amount of background knowledge 
across several disciplines: finance, statistics, mathematics, and computer science. It 
is the synergy of various concepts across these different fields that marks the success 
of modern risk management. Although many excellent books have been written on 
the subject of simulation, none has been written from a risk management perspective. 
It is therefore timely and important to have a text that readily introduces the modern 
techniques of simulation and risk management to the financial world. 

This text aims at introducing simulation techniques for practitioners in the finan- 
cial and risk management industry at an intermediate level. The only prerequisite is 
a standard undergraduate course in probability at the level of Hogg and Tanis (2006), 
say, and some rudimentary exposure to finance. The present volume stems from a 
set of lecture notes used at the Chinese University of Hong Kong. It aims at striking
a balance between theory and applications of risk management and simulations,
particularly in the financial sector. The book comprises three parts.

• Part one consists of the first three chapters. After introducing the motivations 
of simulation in Chapter 1, basic ideas of Wiener processes and Ito’s calculus 
are introduced in Chapters 2 and 3. The reason for this inclusion is that many 
students have experienced difficulties in this area because they lack the under- 
standing of the theoretical underpinnings of these topics. We try to introduce 
these topics at an operational level so that readers can immediately appreciate 
the complexity and importance of stochastic calculus and its relationship with 
simulations. This will pave the way for a smooth transition to option pricing and 
Greeks in later chapters. For readers familiar with these topics, this part can be 
used as a review. 

• Chapters 4-6 comprise the second part of the book. This part constitutes the
main core of an introductory course in risk management. It covers standard topics
in a traditional course in simulation, but at a much higher and more succinct level.
Technical details are left in the references, but important ideas are explained in 
a conceptual manner. Examples are also given throughout to illustrate the use of 
these techniques in risk management. By introducing simulations this way, both 
students with strong theoretical background and students with strong practical 
motivations get excited about the subject early on.

• The remaining Chapters 7-10 constitute part 3 of the book. In this part, more
advanced and exotic topics of simulations in financial engineering and risk man- 
agement are introduced. One distinctive feature in these chapters is the inclusion 
of case studies. Many of these cases have strong practical bearings such as pric- 
ing of exotic options, simulations of Greeks in hedging, and the use of Bayesian 
ideas to assess the impact of jumps. By means of these examples, it is hoped that 
readers can acquire a first-hand knowledge about the importance of simulations 
and apply them to their work. 

Throughout the book, examples from finance and risk management have been
incorporated as much as possible, ranging from an early chapter that discusses the
VaR of the Dow to the pricing of basket options in a multiasset setting. Almost all of
the examples and cases are illustrated with S-Plus and some with Visual Basic. Readers
will be able to reproduce the analysis and learn about either S-Plus or Visual Basic by
replicating some of the empirical work.

Many recent developments in both simulations and risk management, such as 
Gibbs sampling, the use of heavy-tailed distributions in VaR calculation, and princi- 
pal components in multiasset settings are discussed and illustrated in detail. Although 
many of these developments have found applications in the academic literature, they 
are less understood among practitioners. Inclusion of these topics narrows the gap 
between academic developments and practical applications. 

In summary, this text fills a vacuum in the market of simulations and risk manage- 
ment. By giving both conceptual and practical illustrations, this text not only provides 
an efficient vehicle for practitioners to apply simulation techniques, but also demon- 
strates a synergy of these techniques. The examples and discussions in later chapters 
make recent developments in simulations and risk management more accessible to a 
larger audience. 

Several versions of these lecture notes have been used in a simulation course given 
at the Chinese University of Hong Kong. We are grateful for many suggestions, com- 
ments, and questions from both students and colleagues. In particular, the first author 
is indebted to Professor John Lehoczky at Carnegie Mellon University, from whom 
he learned the essence of simulations in computational finance. Part 2 of this book
reflects many of John’s ideas and is reminiscent of his lecture notes at Carnegie
Mellon. We would also like to thank Yu-Fung Lam and Ka-Yung Lau for their help in
carrying out some of the computational tasks in the examples and for producing the
figures in LaTeX, and Mr. Steve Quigley and Ms. Susanne Steitz, both from Wiley,
for their patience and professional assistance in guiding the preparation and production
of this book. Financial support from the Research Grants Council of Hong Kong
throughout this project is gratefully acknowledged. Last, but not least, we would like 
to thank our families for their understanding and encouragement in writing this book. 
Any remaining errors are, of course, our sole responsibility. 

Ngai Hang Chan and Hoi Ying Wong 
Shatin, Hong Kong


PRELIMINARIES OF VBA


1.1 INTRODUCTION 

This chapter introduces the elementary programming skills in Visual Basic for Appli- 
cations (VBA) that we use for numerical computation of the examples in the book. 
Experienced readers can read this chapter as a quick review. 


1.2 BASIC EXCEL VBA

Microsoft Excel is widely used in the financial industry for performing financial 
calculations. VBA is a common programming language linked to Excel and other 
Microsoft Office software that was developed to automatically control and perform 
repetitive actions. In this section, we guide readers on how to start VBA in Microsoft
Excel and give some popular algorithms for performing repetitions. In most cases, 
simple algorithms will be sufficient to perform the computations in the examples and 
exercises. We provide the illustrations in Excel 2010, although other versions can be 
set up in a similar way. For a comprehensive reference, readers are referred to other 
books. 


1.2.1 Developer Mode and Security Level 

For first-time users of VBA in Excel, it is more convenient to switch on the developer
mode, where many of the VBA functions can be easily accessed. To open the
developer mode, follow these steps:

Click [File] -> [Options] (Fig. 1.1) -> [Customize Ribbon] (Fig. 1.2) -> [Developer].


Figure 1.3 shows the ribbons at the top of Excel after switching on the developer 
mode. Macros refer to the codes executed in the VBA language. To execute the macros 
promptly, users are recommended to turn down the security level as follows: 

Click [Macro Security] (Fig. 1.3) -> [Macro Settings] -> [Enable all macros] (Fig. 1.4).


1.2.2 Visual Basic Editor 

Microsoft provides a Visual Basic editor (VBE) in Excel for editing VBA codes.
Macros are created, edited, and debugged in the VBE before being executed. A macro
is usually created as a Sub or Function procedure that can perform automatic tasks,
while a module consists of one or more macros. Similarly, a project has one or more
modules. Sub and Function are reserved keywords in VBA. Users need to avoid using
keywords as names when defining new variables. The codes in the VBE are saved
together with the Excel worksheet. In Excel 2010, these worksheets can be saved as
Excel Macro-Enabled Workbook (.xlsm) files.

Figure 1.1 Excel [Options].
Figure 1.2 Developer mode selection.
Figure 1.3 Excel in developer mode.

To open and edit macros in VBE, follow this procedure:

1. Open VBE: click the [Visual Basic] button under the developer mode (Fig. 1.3)
or press ALT+F11.

2. Insert module: in the project window of the VBE, right-click on one of the
worksheets, and select [Insert] -> [Module] (Fig. 1.5).

3. Edit in VBE: type the codes in the panel on the right.

4. Execute the program: in the VBE, select the module, and click the “Play” button
or press F5. Alternatively, in the Excel worksheet (Fig. 1.3), click the [Macros] button,
choose the macro to be run, and click [Run]. A command button for a specific macro
can be inserted in the Excel worksheet to facilitate the execution. See Section
1.2.4 for details.

Figure 1.5 Visual basic editor.

1.2.3 The Macro Recorder 

The macro recorder can record the actions that you perform in the Excel worksheet, 
such as building a chart or typing words, and transfer the actions into the macros in the 
VBE. This will be useful when you do not know how to code the actions and need to 
repeat them later. However, the macro recorder cannot handle codes that involve using 
the For loop or other repetitive loops and assigning variables. Different environments 
in Excel may generate different codes for the same task. Nevertheless, it can be a 
handy tool for learning new VBA codes. To record a macro, do the following: 

1. Open the macro recorder: in the developer mode (Fig. 1.3), click [Record 
Macro]. 

2. Type the name to be used for the macro and a description of it so that you can
recognize the macro next time (Fig. 1.6), then click [OK]. Note that the name
should begin with a letter and contain no spaces or special characters.

3. Perform the tasks to be recorded; for example, type “Hello World” in cell A1.

4. Stop the macro recorder: click the [Stop Recording] button. 

5. Go to the VBE to see the codes generated by the computer (Fig. 1.7).

Figure 1.6 The macro recorder.
Figure 1.7 The recorded codes.


The recorder creates a new Sub procedure in the VBE (Fig. 1.7). To run this macro,
just click [Macros] at the top ribbon in Excel and select the macro you want to run.
In the recorded codes, the words following the symbol ' are not executed and serve 
only as comments. These comments are added to the codes to increase the readability 
for other users. It is a good programming habit to provide comments inside the codes 
to explain the details of the algorithm or define the variables. Comments can also be 
added by putting the keyword Rem at the beginning of the line. 
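
As a rough illustration of the two comment styles, the following sketch shows a macro of the kind the recorder might produce for step 3 above. The exact statements generated depend on the Excel version and on the actions performed, so the macro name Macro1 and the comment text are assumptions rather than a copy of Fig. 1.7.

```vba
Sub Macro1()
'
' Macro1 Macro
' Types "Hello World" in cell A1 (comment introduced by an apostrophe)
'
    Rem A comment can also start with the keyword Rem
    Range("A1").Select
    ActiveCell.FormulaR1C1 = "Hello World"
End Sub
```

Neither the lines starting with ' nor the line starting with Rem is executed; only the two statements acting on cell A1 are.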


1.2.4 Setting Up a Command Button 

To run a specific macro in the Excel spreadsheet without selecting the macro proce- 
dure list, it is more convenient to designate a command button for each frequently 
used macro. To run the macro, the user just needs to press the command button. To 
insert a command button, follow this procedure:

1. Click the [Insert] icon in the developer mode ribbon, and click the Command
Button under [Form Controls] (Fig. 1.8).

2. Drag the mouse over a rectangle in the spreadsheet and release, then select the 
macro for the button. 

3. To edit the button, left-click the name of the command button to change the 
name. Right-click the command button and select [Assign Macros] (Fig. 1.9) 
to change the macro. 

4. Click on the command button to run the macro. 

With this command button, users can quickly execute a macro.


1.3 VBA PROGRAMMING FUNDAMENTALS

1.3.1 Declaration of Variables 

A variable in programming is the name of a place in the computer’s memory where 
some values or objects can be stored. To declare a variable in VBA, we use the 
following statement: 

Dim varname [As vartype] , 

where varname is the variable name and vartype is the variable type. A variable
name must begin with a letter and contain only letters, digits, and
underscores. The name should not be the same as a VBA reserved word, such as Sub,
Function, End, For, Optional, New, Next, Nothing, Integer, and String. Note also that
VBA does not distinguish between uppercase and lowercase letters in names.

The [As vartype] part is optional, so specifying the type of a variable is not required.
This is different from many other programming languages, which require the programmer
to explicitly define the data type of each variable used. However, if you do not specify
the data type explicitly, VBA will be slower to execute and use memory less efficiently.

1.3.2 Types of Variables 

Every variable can be classified into one of four basic types: string data, date data, 
numeric data, and variant data. The string data type is used to store a sequence of 
characters. The date data type can hold dates and times separately and simultane- 
ously. The types used most frequently in this book are numeric data and variant 
data. 

There are several numeric data types in VBA, and the details of each type are 
listed in Table 1.1. In general, it is more efficient to use the data type that uses the 
smallest number of bytes. This can significantly reduce the computational time for 
simulations. 

The variant data type is the most flexible because it can store both numeric and
non-numeric values. VBA will try to convert a variant variable to the data type that
can hold the input data. Because the [As vartype] part is optional, a variable declared
without a type is stored as Variant by default.

A variant type variable can also hold three special types of value: error code, Empty 
(indicating that the variable is empty and is not equal to 0, False, an empty string, or 
other value), and Null (the variable has not been assigned to memory and is not equal 
to 0, False, an empty string, Empty, or other value). 
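
The behavior of a Variant can be checked with a few built-in functions. The following is a minimal sketch, assuming an empty worksheet is active; the procedure name VariantEx is hypothetical, while IsEmpty, IsNull, and TypeName are standard VBA functions.

```vba
Sub VariantEx()
    Dim v                      ' no type given, so v is a Variant
    Cells(1, 1) = IsEmpty(v)   ' True: nothing has been assigned yet
    v = 10
    Cells(2, 1) = TypeName(v)  ' "Integer"
    v = "ten"
    Cells(3, 1) = TypeName(v)  ' "String"
    v = Null
    Cells(4, 1) = IsNull(v)    ' True
End Sub
```

The same variable holds an integer, a string, and Null in turn, which is the flexibility described above.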

The following codes show some examples of variable declaration statements: 


Dim x As Integer
Dim z As String
z = "This is a string"

Dim Today As Date
Today = #1/9/2014# 'defined using month/day/year format
TABLE 1.1 Numeric Data Types

Type      Shorthand   Range                                      Description
Byte                  0 to 255                                   Unsigned integer number
Boolean               True (-1) or False (0)                     Truth value
Integer   %           -32,768 to 32,767                          Signed integer number
Long      &           -2,147,483,648 to 2,147,483,647            Signed integer number
Single    !           ±3.402823E38 to ±1.401298E-45              Signed single-precision
                                                                 floating-point number
Double    #           ±1.79769313486231E308 to                   Signed double-precision
                      ±4.94065645841247E-324                     floating-point number
Decimal               ±7.922816251426433759E28 with no           Cannot be directly declared
                      decimal point and                          in VBA; requires the use of
                      ±7.922816251426433759354 with 28           a Variant data type
                      digits behind the decimal point


1.3.3 Declaration of Multiple Variables

We use the following statement to declare several variables: 


Dim x As Integer, y As Integer, z As Integer 


However, the declaration that 

Dim x, y, z As Integer 

declares only z as the Integer type, while x and y are declared as variant types. We can
use shorthand (Table 1.1) to improve the cleanness and readability of the program:

Dim x#, y#, z As Double 
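
The effect of these declarations can be verified with the standard TypeName function. The sketch below assumes an empty active worksheet; the procedure name DeclareEx and the variable names are hypothetical.

```vba
Sub DeclareEx()
    Dim x, y, z As Integer      ' only z is an Integer
    Dim a#, b As Double         ' a is a Double through the shorthand #
    Cells(1, 1) = TypeName(x)   ' "Empty": x is an uninitialized Variant
    Cells(2, 1) = TypeName(z)   ' "Integer"
    Cells(3, 1) = TypeName(a)   ' "Double"
    Cells(4, 1) = TypeName(b)   ' "Double"
End Sub
```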


1.3.4 Declaration of Constants 

Constants are declared in a Const statement as follows:

Const interest_rate As Double = 0.02
Const dividend_yield = 0.02 'without declaring the constant type
Const option_type As String = "Put"

1.3.5 Operators 

This section introduces the assignment operator, mathematical operators, comparative
operators, and logical operators. The equal sign (=) is an assignment operator that
is used to assign the value of an expression to a variable or constant. An expression
is a combination of keywords, operators, variables, and constants that yields a string,
number, or object.

For example,

y = 3 * 2
y = y * 6

Then y is evaluated as 36.

Other common mathematical operators include addition (+), multiplication (*),
division (/), subtraction (-), and exponentiation (^).

VBA also supports the same comparative operators used in Excel formulas: equal
to (=), greater than (>), less than (<), greater than or equal to (>=), less than or equal
to (<=), and not equal to (<>).

Table 1.2 lists the logical operators and their functions in VBA.

TABLE 1.2 VBA Logical Operators

Operator   Uses
Not        Performs a logical negation on an expression
And        Performs a logical conjunction on two expressions
Or         Performs a logical disjunction on two expressions
Xor        Performs a logical exclusion on two expressions
Eqv        Performs a logical equivalence on two expressions
Imp        Performs a logical implication on two expressions
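
To see how these operators combine, the following minimal sketch evaluates a few expressions and writes the results to column A. The procedure name OperatorEx and the values of x and y are assumptions chosen for illustration.

```vba
Sub OperatorEx()
    Dim x As Double, y As Double
    x = 2
    y = 3
    Cells(1, 1) = x ^ y                  ' exponentiation: 8
    Cells(2, 1) = (x < y) And (y <> 3)   ' False
    Cells(3, 1) = (x < y) Or (y <> 3)    ' True
    Cells(4, 1) = (x < y) Xor (y = 3)    ' False: both conditions are True
    Cells(5, 1) = (x > y) Imp (y = 3)    ' True: a False premise implies anything
    Cells(6, 1) = Not (x = y)            ' True
End Sub
```

Parentheses around each comparison are not strictly required, but they make the order of evaluation explicit.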

1.3.6 User-Defined Data Types 

VBA provides the Type statement to allow users to create custom, more complex
data types, known as user-defined data types (UDTs). The syntax for creating a UDT
is as follows:


[Private | Public] Type typename 
[element_name As vartype] 
[element_name As vartype] 

End Type 


[Private | Public]: (optional) a UDT is Public by default. If it is declared as Private,
variables of this type can only be declared in the same module as the UDT.

typename: (required) this is the name of the UDT, and it follows the standard variable
naming conventions.

element_name: (required) this is the name of the elements within a UDT, and it
follows the standard variable naming conventions.

vartype: (required) unlike declaring ordinary variables, the elements within a UDT
must be assigned a data type, which can be any variable type (including Variant)
or a UDT.

A UDT is defined at the top of the module before any procedures. To refer to the
subelements within the UDT, use the period (.) operator. See the following example
for illustration.

Example 1.1 The following code defines a nested UDT, which stores the name and
coordinates of a point.

Type Coordinate
    x As Double
    y As Double
End Type

Type Point
    name As String
    z As Coordinate
End Type

Sub UDTEx1()
    'Declare p1 as UDT Point
    Dim p1 As Point

    'Assign the values
    p1.name = "A"
    p1.z.x = 3.5
    p1.z.y = 3.1

    'Print out the values to the spreadsheet
    Cells(1, 1) = p1.name
    Cells(2, 1) = p1.z.x
    Cells(3, 1) = p1.z.y
End Sub


1.3.7 Arrays and Matrices 

An array is a collection of variables of the same type that have a common name. The 
index numbering makes it easy for users to perform looping in repetitive tasks. 

The following statement declares a one-dimensional (1D) array:

Dim varname(LowerIndex To UpperIndex) As vartype

In this way, users can access the variables with varname(LowerIndex),
varname(LowerIndex + 1), ..., varname(UpperIndex).

If only the upper index is specified, that is,

Dim varname(UpperIndex) As vartype

VBA will assume that 0 is the lower index.

A multidimensional array can be declared as: 


Dim varname(LowerIndex1 To UpperIndex1, LowerIndex2 To _
UpperIndex2, ..., LowerIndexN To UpperIndexN) As vartype

Note that both the lower index and the upper index must be a constant or a number.
If the index bounds are held in variables, a dynamic array, which does not have a preset
number of elements, should be used. The following statement declares a dynamic array:


Dim varname ( ) As vartype 


Before the dynamic array is used, a ReDim statement should be inserted to specify
the number of elements in the array. For example,

ReDim varname(LowerIndex To UpperIndex)

To declare a matrix of size m × n containing real numbers, use the following statements:

Dim matrixmn() As Double
ReDim matrixmn(1 To m, 1 To n)
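
As a minimal sketch of how such a matrix might be filled and inspected, the following procedure sizes a dynamic array at run time and prints its entries and dimensions to the worksheet. The name MatrixEx and the values m = 2 and n = 3 are assumptions for illustration; UBound is the standard VBA function that returns the upper index of a given dimension.

```vba
Sub MatrixEx()
    Dim m As Long, n As Long, i As Long, j As Long
    Dim a() As Double
    m = 2
    n = 3
    ReDim a(1 To m, 1 To n)          ' size is set at run time
    For i = 1 To m
        For j = 1 To n
            a(i, j) = i * j          ' entry in row i, column j
            Cells(i, j) = a(i, j)
        Next j
    Next i
    Cells(m + 1, 1) = UBound(a, 1)   ' number of rows: 2
    Cells(m + 1, 2) = UBound(a, 2)   ' number of columns: 3
End Sub
```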


1.3.8 Data Input and Output 

One advantage of using Excel VBA is that it can link the VBE and worksheet so
that users can read in and print out data in the worksheet and execute the programs
written in VBE. The following statements are usually used to read in and print out data,
respectively:

' Read in data
Var = Cells(i, j)

' Print out data
Cells(i, j) = Var

where i and j denote the row number and the column number of a cell, respectively.
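
For instance, the two statements can be combined in a short procedure. This is only a sketch: the name InputOutputEx is hypothetical, and it assumes that cells A1 and B1 of the active worksheet already contain numbers.

```vba
Sub InputOutputEx()
    Dim s As Double, k As Double
    s = Cells(1, 1)        ' read in the value of cell A1
    k = Cells(1, 2)        ' read in the value of cell B1
    Cells(1, 3) = s - k    ' print out the difference to cell C1
End Sub
```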

1.3.9 Conditional Statements 

Conditional statements allow users to perform different tasks subject to different
conditions. The two main conditional statements in VBA are If-Then-Else
and Select-Case statements. There are two forms of the If-Then-Else statement:
single-lined and multi-lined. Only one statement is allowed in the single-lined form,
whereas many statements can be inserted in the multi-lined form. The syntax of the
If-Then-Else statements is as follows:

' the Else clause is optional
If [condition] Then [statement] [Else [elseStatement]]

' ... indicates that more statements can be included
' these Else clauses are also optional
If [condition] Then
    [statement]
ElseIf [elseif condition1] Then
    [statement]
ElseIf [elseif condition2] Then
    [statement]
...
Else
    [statement]
End If


In the conditional part of the statement, the user needs to specify an expression that 
can be evaluated as True or False. The comparative operators and logical operators 
in Table 1.2 can help to express more complex conditions. 

The Select-Case statement is useful for choosing among three or more options and 
is a good alternative to the If-Then-Else statement. The syntax for Select-Case is as 
follows: 


Select Case [testexpression]
    Case expressionlist-n
        [instructions-n]
    Case expressionlist-n
        [instructions-n]
    Case Else
        [default_instructions]
End Select
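
The two kinds of conditional statement can be combined. The following sketch computes the payoff of a European call or put at maturity; the function name PayoffEx, its arguments, and the return value -1 for an unrecognized option type are all assumptions made for illustration.

```vba
Function PayoffEx(ByVal option_type As String, ByVal s As Double, _
                  ByVal k As Double) As Double
    Select Case option_type
        Case "Call"
            ' single-lined If-Then-Else
            If s > k Then PayoffEx = s - k Else PayoffEx = 0
        Case "Put"
            ' multi-lined If-Then-Else
            If k > s Then
                PayoffEx = k - s
            Else
                PayoffEx = 0
            End If
        Case Else
            PayoffEx = -1   ' hypothetical flag for an unknown option type
    End Select
End Function
```

With a stock price of 105 and a strike of 100, PayoffEx("Call", 105, 100) gives 5 and PayoffEx("Put", 105, 100) gives 0.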


1.3.10 Loops 

Loops allow users to perform certain tasks repeatedly.
For-Next loops and Do loops are widely used in VBA programming. In particular,
For-Next loops are frequently used in simulations. The syntax for a For-Next loop is
as follows:


For counter = startValue To endValue [Step nStep] 
[statements] 

[Exit For] 

[statements] 


Next counter


If the Step nStep part is omitted, the counter will increase by 1 each time. We can set
nStep to be n, and the counter will then increase by n each time.

For a Do Loop, the syntax is as follows: 

Do [do_condition] 

[statements] 

[Exit Do] 

[statements] 

Loop [loop_condition] 


Although both the do_condition and the loop_condition are optional, only one of them 
can be used for a Do Loop. If both are omitted, then the user must specify a condition 
and call Exit Do to end the loop. Otherwise, the program will not terminate. The 
syntax is the same for do_condition and loop_condition. 

While | Until condition 

For While, the loop will continue as long as condition is True. For Until, the loop 
breaks once condition becomes True. Whether to use While or Until depends solely 
on the programmer’s preference, as the same task can be performed by either loop. 
However, whether to put the condition after Do or Loop depends on the situation, 
because if it is put after Loop, then the loop is repeated at least once. The following 
example illustrates the uses of different loops to perform the same task. 

Example 1.2 Use five different loops to print the numbers 1 to 10 in cells A1 to A10.

' For Loop
For i = 1 To 10
    Cells(i, 1) = i
Next i

' Do Loop Method 1
i = 1
Do While i <= 10
    Cells(i, 1) = i
    i = i + 1
Loop

' Do Loop Method 2
i = 1
Do Until i > 10
    Cells(i, 1) = i
    i = i + 1
Loop

' Do Loop Method 3
i = 1
Do
    Cells(i, 1) = i
    i = i + 1
Loop While i <= 10

' Do Loop Method 4
i = 1
Do
    Cells(i, 1) = i
    i = i + 1
Loop Until i > 10
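
The Step option and the Exit Do statement do not appear in Example 1.2, so a further sketch may help. The two functions below are hypothetical illustrations: SumOddEx adds the odd numbers up to n using Step 2, and FirstPowerOfTwoAboveEx uses a Do loop without any condition, which must therefore be ended with Exit Do.

```vba
Function SumOddEx(ByVal n As Long) As Long
    Dim i As Long, total As Long
    For i = 1 To n Step 2        ' i = 1, 3, 5, ...
        total = total + i
    Next i
    SumOddEx = total
End Function

Function FirstPowerOfTwoAboveEx(ByVal limit As Double) As Double
    Dim p As Double
    p = 1
    Do                           ' no While or Until condition
        If p > limit Then Exit Do
        p = p * 2
    Loop
    FirstPowerOfTwoAboveEx = p
End Function
```

For example, SumOddEx(10) returns 1 + 3 + 5 + 7 + 9 = 25, and FirstPowerOfTwoAboveEx(100) returns 128.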


1.3.11 Sub Procedures and Function Procedures 

Large programs often need to be divided into smaller pieces for easier management 
and maintenance. In VBA, a procedure is basically a set of computer codes that 
performs certain tasks. There are two types of procedures: a Sub procedure and a 
Function procedure. A Sub procedure performs tasks but does not return values, while 
a Function procedure returns a value at the end of the procedure. 

The syntax that defines a Sub procedure is as follows: 


[Private | Public] [Static] Sub name([arglist])
    [statements]
End Sub


Private | Public: (optional) the Sub is Public by default if Public or Private is omitted.
Public indicates that the Sub is accessible by other Subs or Functions in
all modules. Private indicates that the Sub is accessible only to the Subs and
Functions in the same module.

Static: (optional) Static indicates that all local variables in the Sub are preserved
at the end of the Sub. If Static is omitted, the values of the local variables will
be reset each time the Sub ends.

name: (required) this is the identifier of the Sub. It follows the standard variable
naming conventions, must be unique, and cannot be the same as the identifier
of other Subs, Functions, classes, etc.

arglist: (optional) this is a list of variables representing the parameters that are
passed to the Sub when it is called. Multiple variables are separated by commas.
If the procedure uses no arguments, a set of empty parentheses is required.

statements: (optional) this refers to any group of statements to be executed within
the Sub.

Example 1.3 The following procedure, SubEx2, calculates var1 + var2 and outputs
the result in cell A1:

Sub SubEx2(var1, var2)
    Cells(1, 1) = var1 + var2
End Sub

To call the Sub, use one of the following two statements, where x and y can also be
replaced by other constants or variables.

Call SubEx2(x, y)

SubEx2 x, y

Instead of just specifying the name of the parameters, each parameter in arglist 
can be specified with the following syntax: 

[Optional] [ByRef | ByVal] varname [As vartype] [= defaultvalue]

Optional: (optional) this indicates that this parameter is optional and will take the
defaultvalue as its value if it is omitted when the Sub is called.

ByRef | ByVal: (optional) the parameter is passed ByRef by default. ByRef and
ByVal indicate whether the parameter is passed by address or by value. When
calling with ByRef, the memory address of the parameter is passed to the procedure
and any change in the parameter value in the procedure will change the
original parameter. For ByVal, a copy of the value of the parameter is passed so
the original parameter will not be affected.

varname: (required) this is the identifier of the parameters.

vartype: (optional) the variable type is Variant by default. It is the variable type
of the parameter passed, which can be any of the variable types or a UDT.
If the type of the variable that is passed when calling the Sub does not match, an error
“ByRef/ByVal argument type mismatch” is shown.

defaultvalue: (optional) this is the value that the parameter will take when the
parameter is not specified and the Sub is called.
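
The Optional keyword and its default value can be illustrated with a short sketch. The procedure names DiscountEx and DiscountEx_Run, the parameters t and r, and the default rate 0.02 are assumptions chosen for illustration; Exp is the standard VBA exponential function.

```vba
Sub DiscountEx(ByVal t As Double, Optional ByVal r As Double = 0.02)
    ' Writes the discount factor exp(-r * t) to cell A1
    Cells(1, 1) = Exp(-r * t)
End Sub

Sub DiscountEx_Run()
    Call DiscountEx(1)          ' r is omitted and takes the default 0.02
    Cells(2, 1) = Cells(1, 1)   ' keep a copy of the first result in A2
    Call DiscountEx(1, 0.05)    ' r is given explicitly as 0.05
End Sub
```

After running DiscountEx_Run, cell A2 holds the discount factor under the default rate and cell A1 holds the one under the rate 0.05.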

Example 1.4 The following code demonstrates the difference between ByRef and
ByVal:


Sub SubEx3_Run()
    Dim x As Integer, y As Integer
    x = 1
    y = 1
    Call SubEx3(x, y)
    Cells(1, 1) = x
    Cells(2, 1) = y
End Sub

Sub SubEx3(ByRef var1 As Integer, ByVal var2 As Integer)
    var1 = var1 + 1
    var2 = var2 + 1
End Sub


Cell A1 shows 2, because the change in the value of var1 in SubEx3 actually
changes the value of x. Cell A2 shows 1, because the change in the value of var2
in SubEx3 does not affect the value of y.
VBA also allows the user to create a Sub that takes an arbitrary number of
parameters using ParamArray. When ParamArray is used, the parameters can be passed
only by reference and must be declared as the Variant type. They are stored in an
array with the parameter’s name. To declare such a Sub, use


Sub SubEx4(ParamArray var())
    [statements]
End Sub
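
The declaration above can be fleshed out as in the following sketch, which adds up however many numbers are passed and writes the total to cell A1. The names SubEx_Sum, SubEx_Sum_Run, and vals are hypothetical, and the arguments are assumed to be numeric.

```vba
' Hypothetical example: sum an arbitrary number of numeric arguments.
Sub SubEx_Sum(ParamArray vals() As Variant)
    Dim i As Long
    Dim total As Double
    total = 0
    ' The arguments arrive as a zero-based Variant array named vals.
    ' If no argument is passed, UBound(vals) is -1 and the loop is skipped.
    For i = LBound(vals) To UBound(vals)
        total = total + vals(i)
    Next i
    Cells(1, 1) = total
End Sub

Sub SubEx_Sum_Run()
    Call SubEx_Sum(1, 2.5, 4)   ' cell A1 receives 7.5
End Sub
```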


Unlike a Sub procedure, a Function can be used in an Excel spreadsheet as a
user-defined function. The syntax that defines a Function is as follows:

[Private | Public] [Static] Function name([arglist]) [As vartype]
    [statements]
End Function

For Private | Public, Static, name, and arglist, a Function is identical to a Sub. The
only difference between the declarations of Function and Sub is that, when defining a
Function, the user may specify the return type vartype; otherwise, the return
type is Variant by default. To return a value from a Function, the user assigns
that value to a variable with the same name as the function. To call a Function,
use one of the following statements:

Call FuncName(x, y)

FuncName x, y

z = FuncName(x, y)

Note that the first two statements are identical to those used for Sub, so one can treat 
Function as Sub if the return value does not matter. For the third statement, the return 
value will be stored in z. 

As a Sub cannot return a value, it may be necessary to use global variables or
to pass variables by reference to accomplish certain tasks. Example 1.5 calculates
var1 + var2 and outputs the result into cell A1 using a Function, which is
analogous to Example 1.3.

Example 1.5 The following code calculates 3 + 4 by calling Function FuncEx4 and
outputs the sum of the two numbers, 7, into cell A1.


Sub SubEx4()
    Cells(1, 1) = FuncEx4(3, 4)
End Sub

Function FuncEx4(var1 As Integer, var2 As Integer) As Integer
    FuncEx4 = var1 + var2
End Function
TABLE 1.3 Common Built-In Math Functions in VBA

Function          Return Value                                 Math Expression
Abs(x)            Absolute value of x                          \(|x|\)
Atn(x)            Arc-tangent of x in radians                  \(\tan^{-1} x\)
Cos(x)            Cosine of x                                  \(\cos x\)
Exp(x)            Exponential of x                             \(e^{x}\)
Int(x)            The integral part of x                       \(\lfloor x \rfloor\)
Log(x)            Natural logarithm of x                       \(\ln x\)
Round(x[, dp])    x rounded to dp decimal places;
                  dp is 0 by default if omitted
Sgn(x)            Number indicating the sign of x:             \(\operatorname{sgn} x\)
                  -1 for x < 0, 0 for x = 0, 1 for x > 0
Sin(x)            Sine of x                                    \(\sin x\)
Sqr(x)            Square root of x                             \(\sqrt{x}\)
Tan(x)            Tangent of x                                 \(\tan x\)


1.3.12 VBA’s Built-In Functions 

VBA has a variety of built-in functions that can simplify calculations and operations. 
For a complete list of functions, please refer to the VBA Help System. In the VBE,
you can type “VBA.” (VBA followed by a period) to display a list of VBA functions.
Table 1.3 shows some of the commonly used built-in mathematical functions and their
return values in descriptive and mathematical forms.

Remarks: If the input number is negative, then the function Int returns the largest
integer that is less than or equal to the number, whereas the Fix function returns
the smallest integer that is greater than or equal to the number. For example,
Int(-8.3) returns -9, whereas Fix(-8.3) gives -8.
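
The difference between Int and Fix can be checked with a short sketch such as the following. The procedure name IntFixDemo is hypothetical; Int and Fix are the built-in VBA functions described above.

```vba
' Hypothetical demonstration of Int versus Fix for negative and positive inputs.
Sub IntFixDemo()
    Cells(1, 1) = Int(-8.3)   ' -9: rounds down, away from zero
    Cells(2, 1) = Fix(-8.3)   ' -8: truncates toward zero
    Cells(3, 1) = Int(8.3)    '  8
    Cells(4, 1) = Fix(8.3)    '  8: the two agree for positive numbers
End Sub
```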

Excel VBA also allows users to use Excel worksheet functions such as Average and
Stdev. To call a worksheet function, use one of the following commands:


Application.FunctionName([arglist])

WorksheetFunction.FunctionName([arglist])

Application.WorksheetFunction.FunctionName([arglist])

For example, to calculate \(\sin^{-1} 0.5\), which is not provided in VBA’s built-in function
library but is included in Excel, one can use


x = Application.Asin(0.5)


This returns the value 0.5236 (\(\approx \pi/6\)), which is stored in x. Note that not all Excel
worksheet functions can be used in VBA. In particular, worksheet functions that have an
equivalent VBA function, such as Sqrt and Sin, cannot be used. For a complete list of
Excel worksheet functions, please refer to the Excel help pages.
BASIC PROPERTIES OF FUTURES AND OPTIONS


2.1 INTRODUCTION 

A financial derivative is a security whose value depends on the values of other more 
elementary securities, such as equities, bonds, and commodities. Forward contracts 
and futures contracts are two typical derivatives trading in the financial market. The 
primary use of forward and futures contracts is to hedge against portfolio risk, but they 
also offer speculative opportunities to investors. Before introducing the properties of 
these contracts, we present some fundamental concepts in derivative pricing. 


2.1.1 Arbitrage and Hedging 

An arbitrage opportunity is a situation whereby an investor is able to enter into a trade, 
usually involving two or more markets, in which he/she can lock in a position with a 
positive probability of profit and a zero probability of loss. An arbitrage opportunity 
usually lasts for a very short time in an efficient market. In pricing derivatives, we 
want to make sure that the fair prices of the derivatives will not lead to any arbitrage 
opportunities. 

As mentioned, forwards and futures are used to hedge against risk, which means 
they can be used to transfer the risk of unfavorable price fluctuations to other market 
participants. For example, assume that you are holding a share of a stock currently 
worth $45, and you have a deal with a counterparty that you will exchange that share 
with him for $50 one month later. One month later, you are sure to get $50 if your 
counterparty honors the deal. In this way, you hedge the market risk of the stock price
for a fixed return. The existence of derivatives markets facilitates hedging and also 
possible speculation with large leveraging. 

Another important concept is risk-neutral pricing, which states that the price of a
derivative determined in a “risk-neutral” world agrees with the price obtained in the
real world. In the risk-neutral world, every security generates the same expected
rate of return, which is the risk-free interest rate, so an investor can earn excess
returns only through “pure luck.” Modern derivative pricing theory links the absence
of arbitrage to the existence of a risk-neutral world for the valuation of
derivatives.


2.1.2 Forward Contracts 

A forward contract is usually an over-the-counter (OTC) agreement between the 
buyer and the seller, whereby the buyer agrees to buy an asset (long position) from 
the seller (short position) at a certain future time (maturity) for a prespecified price 
(delivery price). The contract is usually traded between two financial institutions or 
between a financial institution and one of its corporate clients, but it is not traded on 
an exchange. 

At the time of initiation of the contract, the delivery price is chosen so that the 
value of holding the forward contract is zero for both parties. At maturity, the holder 
of the short position delivers the asset to the holder of the long position in return for 
a cash amount equal to the delivery price. At the time the contract is entered into, the 
delivery price equals the forward price. As time passes, the delivery price is fixed, but 
the new forward price for the same underlying asset with the same maturity changes 
from time to time. These forward prices make the contract zero value at each time 
point. Therefore, the forward price generally does not equal the delivery price except 
at the beginning of the contract. 

In the following, we determine the fair price of a forward contract. Let \(S_t\) be the
price of the underlying asset at current time \(t\), \(K\) be the delivery price, \(T\) be the
maturity time of the contract, \(F_t\) be the forward price at time \(t\), \(f_t\) be the value of
the forward contract at time \(t\), and \(r\) be the continuously compounded risk-free
interest rate, which is assumed to be constant. For simplicity, we assume that there is
no transaction cost in the market, the borrowing and lending rates are the same, and
all trading profits are taxed at the same rate. At the initial time \(t = 0\), the forward
price equals the delivery price:

\[ F_0 = K \quad \text{and} \quad f_0 = 0. \]

For a continuously compounded interest rate \(r\), a zero-coupon bond paying $1 at
future time \(T\) is worth \(e^{-r(T-t)}\) at time \(t < T\). To determine the forward price, we
construct two portfolios with the same payoff at maturity \(T\) under all scenarios. In the
absence of arbitrage, these two portfolios must have the same price at current time \(t\);
this concept is referred to as the law of one price. We consider two cases of the
underlying asset.
1. No intermediate income from the underlying asset.

This kind of asset includes non-dividend-paying stocks and zero-coupon bonds.
Consider two portfolios at time \(t\):

A. Long a forward contract with delivery price \(K\) and invest \(Ke^{-r(T-t)}\) in the
bank at the risk-free interest rate \(r\).

B. Long one unit of the underlying asset, worth \(S_t\).

Both portfolios will pay the holder one unit of the asset at time \(T\); therefore,
their current prices should be the same. Otherwise, investors can always long
the cheaper portfolio and short the other one to gain a risk-less profit at maturity
\(T\). Hence, if there is no arbitrage, then

\[ f_t + Ke^{-r(T-t)} = S_t. \]

The forward price \(F_t\) is the delivery price such that the forward contract has
zero value at current time \(t\). Therefore, we have

\[ 0 + F_t e^{-r(T-t)} = S_t \implies F_t = S_t e^{r(T-t)}. \tag{2.1} \]

Example 2.1 Consider a 6-month forward contract on a stock with a delivery price of
$13.50 per share. Assume that the current stock price is $12.00, the risk-free rate is
5.25%, and there is no dividend in the next 6 months. The forward price can be
determined as

\[ F_0 = 12e^{0.0525(0.5-0)} = 12.32. \]

Because the forward price is lower than the delivery price, it is possible to obtain
arbitrage by shorting the forward contract and borrowing $12 from the bank at a rate of
5.25% to buy one share of the stock now. The investor does not need to put any money
into this portfolio at its initiation, so it is called a self-financing portfolio. Six
months later, the investor can deliver the share of stock for $13.50 and pay back $12.32
to the bank. He is sure to earn $13.50 - $12.32 = $1.18 after 6 months.
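
The calculation in Example 2.1 can be reproduced with a short user-defined function. This is only a sketch: the names ForwardPrice, Example21_Run, and tau (the time to maturity \(T - t\) in years) are hypothetical and do not come from the book's code.

```vba
' Forward price of an asset with no intermediate income: F = S * exp(r * tau).
' r is the continuously compounded risk-free rate; tau = T - t in years.
Function ForwardPrice(ByVal S As Double, ByVal r As Double, _
                      ByVal tau As Double) As Double
    ForwardPrice = S * Exp(r * tau)
End Function

Sub Example21_Run()
    Dim F0 As Double
    F0 = ForwardPrice(12, 0.0525, 0.5)
    Cells(1, 1) = Round(F0, 2)          ' forward price, 12.32
    Cells(2, 1) = Round(13.5 - F0, 2)   ' profit locked in by the short forward, 1.18
End Sub
```

Because ForwardPrice is a Function, it can also be typed directly into a worksheet cell, for example =ForwardPrice(12, 0.0525, 0.5).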

2. With a known cash income.

Let \(I\) be the present value of the income to be received from the underlying asset
during the life of the forward. Again, we construct two portfolios as follows:

C. Long a forward contract with delivery price \(K\) and invest \(Ke^{-r(T-t)}\) in the
bank at the risk-free interest rate \(r\).

D. Long one unit of the underlying asset, worth \(S_t\), and borrow an amount \(I\) from
the bank at the risk-free interest rate \(r\).

In portfolio D, the cash income from holding the stock is used to pay back the
bank loan. Both portfolios give the holder one share of the stock at maturity;
thus their current prices must be the same to avoid arbitrage. We have

\[ f_t + Ke^{-r(T-t)} = S_t - I. \]

The forward price is the delivery price that ensures the forward has zero value
at the time of initiation. Therefore,

\[ 0 + F_t e^{-r(T-t)} = S_t - I \implies F_t = (S_t - I)e^{r(T-t)}. \tag{2.2} \]


Example 2.2 Consider a 6-month forward contract on a stock with a delivery price of
$11.50 per share. Assume that the current stock price is $13.00, the risk-free rate
for 6-month maturity is 5.25%, and a $1.20 dividend will be paid 3 months
from now. The 3-month interest rate is 5.1%. The forward price can be determined as

\[ F_0 = \left(13 - 1.2e^{-0.051(0.25-0)}\right)e^{0.0525(0.5-0)} = 12.13. \]

If the investor longs one unit of the forward contract, shorts one share of the stock,
and invests \(13 - 1.2e^{-0.051(0.25)}\) dollars in the bank at a rate of 5.25% for 6 months
and \(1.2e^{-0.051(0.25)}\) dollars at a rate of 5.1% for 3 months, he will have a risk-less
profit after 6 months. This is a self-financing strategy. After 3 months, the 3-month
deposit has grown to $1.20 and is paid out as the dividend owed on the shorted stock.
At maturity, the 6-month deposit is worth $12.13, and the investor buys one share of
stock for $11.50 under the forward contract to return the borrowed share. Therefore,
he gains $12.13 - $11.50 = $0.63 without any risk.


If the dividend is paid out continuously at an annual rate \(q\), then \(q\) is called
the dividend yield, and the forward price can be determined using similar arguments.
Specifically, we keep portfolio A and revise portfolio B to long
\(e^{-q(T-t)}\) units of the stock with all dividends reinvested in the stock, so that we
have exactly one share of the stock at maturity. The present value of holding the
forward contract is

\[ f_t = S_t e^{-q(T-t)} - Ke^{-r(T-t)}. \]

The forward price is the delivery price such that the value of the forward equals
zero, so we have

\[ F_t = S_t e^{(r-q)(T-t)}. \tag{2.3} \]
Example 2.3 Consider a 6-month forward contract on a stock with a delivery price of
$14 per share. Assume that the current stock price is $13.40, the risk-free rate for
6-month maturity is 5.2%, and the dividend yield is 2%. The forward price can be
determined as

\[ F_0 = 13.4e^{(0.052-0.02)(0.5-0)} = 13.62. \]

If the investor shorts one unit of the forward contract and borrows
\(13.4e^{-0.02(0.5)}\) dollars from the bank at a rate of 5.2% to buy \(e^{-0.02(0.5)}\) shares of
stock, with the dividends being reinvested in the stock, then after 6 months he will
have a risk-less profit of \(14 - 13.4e^{(0.052-0.02)(0.5)} = 0.38\) dollars.
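
Equations 2.2 and 2.3 can be coded in the same way. The sketch below reproduces the numbers in Examples 2.2 and 2.3; the function and variable names (ForwardPriceIncome, ForwardPriceYield, pvIncome, tau) are hypothetical, and tau again denotes \(T - t\) in years.

```vba
' Forward price with a known cash income of present value pvIncome (Equation 2.2).
Function ForwardPriceIncome(ByVal S As Double, ByVal pvIncome As Double, _
                            ByVal r As Double, ByVal tau As Double) As Double
    ForwardPriceIncome = (S - pvIncome) * Exp(r * tau)
End Function

' Forward price with a continuous dividend yield q (Equation 2.3).
Function ForwardPriceYield(ByVal S As Double, ByVal r As Double, _
                           ByVal q As Double, ByVal tau As Double) As Double
    ForwardPriceYield = S * Exp((r - q) * tau)
End Function

Sub Examples22_23_Run()
    Dim pvDiv As Double, F As Double
    ' Example 2.2: $1.20 dividend in 3 months, discounted at the 3-month rate.
    pvDiv = 1.2 * Exp(-0.051 * 0.25)
    F = ForwardPriceIncome(13, pvDiv, 0.0525, 0.5)
    Cells(1, 1) = Round(F, 2)           ' 12.13
    Cells(2, 1) = Round(F - 11.5, 2)    ' profit from the long forward, 0.63
    ' Example 2.3: dividend yield of 2%.
    F = ForwardPriceYield(13.4, 0.052, 0.02, 0.5)
    Cells(3, 1) = Round(F, 2)           ' 13.62
    Cells(4, 1) = Round(14 - F, 2)      ' profit from the short forward, 0.38
End Sub
```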

The value of the forward contract changes with time. For example, assume that you
are holding a 6-month forward contract with a delivery price of $10 for a share of the
stock. Suppose that, after 3 months, the delivery price of a new 3-month forward
contract is $12. Your original forward contract is then worth \((12 - 10)e^{-r(0.5-0.25)}\)
dollars, because you can short a new contract for $12 and your position will be closed
out 3 months later. In general, the value of a forward contract is given by the formula

\[ f_t = (F_t - F_0)e^{-r(T-t)}. \tag{2.4} \]
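
The step from the two-portfolio argument to Equation 2.4 can be sketched as follows for an asset with no intermediate income, assuming that the contract was entered into at time 0 so that its delivery price is \(K = F_0\):

```latex
\begin{aligned}
f_t &= S_t - F_0 e^{-r(T-t)}
      && \text{(portfolios A and B have equal value, with } K = F_0\text{)} \\
    &= F_t e^{-r(T-t)} - F_0 e^{-r(T-t)}
      && \text{(Equation 2.1 gives } S_t = F_t e^{-r(T-t)}\text{)} \\
    &= (F_t - F_0)\, e^{-r(T-t)}.
\end{aligned}
```

With \(F_0 = 10\), \(F_t = 12\), and \(T - t = 0.25\), this recovers the value \((12 - 10)e^{-0.25r}\) quoted in the text.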


2.1.3 Futures Contracts 

A futures contract is an agreement between two parties to buy or sell an asset at a
certain future time at the futures price. Unlike forward contracts, futures are normally
traded on an exchange, which can eliminate the default risk of the counterparty. However,
futures contracts are marked to market, meaning that their values are
determined each day according to the market price. Therefore, investors in futures can
be subject to margin calls.

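The exact delivery date of futures is not usually specified, in contrast to forward
contracts. A futures contract is referred to by its delivery month, and the exchange
specifies the period during the month when the delivery must be made. Nowadays,
many futures are settled in cash instead of by actual delivery of the assets. When
the interest rate is constant (or, more generally, deterministic), the theoretical prices
of forward and futures contracts with the same delivery date are the same. To show this,
we denote \(\mathcal{F}_t\) as the futures price and \(F_t\) as the forward price on day \(t\), where
\(r\) is the interest rate per day. Now consider two different trading
strategies with futures and forward contracts, respectively, as follows:

Strategy A. Long \(e^{r}\) units of futures on day 0; on day 1, close out the position and
long \(e^{2r}\) units of futures; on day 2, close out and long \(e^{3r}\) units; and so on,
according to Table 2.1. In addition, invest \(\mathcal{F}_0\) in a risk-free asset.

Strategy B. Long \(e^{rT}\) units of forward contracts with the forward price \(F_0\) and
invest \(F_0\) in a risk-free asset.

TABLE 2.1 Strategy A for Longing Futures Contracts

Day    Futures price           Futures position   Gain/loss                                              Gain/loss at time T
0      \(\mathcal{F}_0\)       \(e^{r}\)          0                                                      0
1      \(\mathcal{F}_1\)       \(e^{2r}\)         \((\mathcal{F}_1 - \mathcal{F}_0)e^{r}\)               \((\mathcal{F}_1 - \mathcal{F}_0)e^{rT}\)
2      \(\mathcal{F}_2\)       \(e^{3r}\)         \((\mathcal{F}_2 - \mathcal{F}_1)e^{2r}\)              \((\mathcal{F}_2 - \mathcal{F}_1)e^{rT}\)
...    ...                     ...                ...                                                    ...
T-1    \(\mathcal{F}_{T-1}\)   \(e^{Tr}\)         \((\mathcal{F}_{T-1} - \mathcal{F}_{T-2})e^{(T-1)r}\)  \((\mathcal{F}_{T-1} - \mathcal{F}_{T-2})e^{rT}\)
T      \(\mathcal{F}_T\)       0                  \((\mathcal{F}_T - \mathcal{F}_{T-1})e^{rT}\)          \((\mathcal{F}_T - \mathcal{F}_{T-1})e^{rT}\)

Each day's gain or loss is carried forward at the risk-free rate to time \(T\). Because
the futures price at delivery equals the spot price, \(\mathcal{F}_T = S_T\), the total gain/loss
at time \(T\) is

\[ 0 + (\mathcal{F}_1 - \mathcal{F}_0)e^{rT} + (\mathcal{F}_2 - \mathcal{F}_1)e^{rT} + \cdots + (\mathcal{F}_T - \mathcal{F}_{T-1})e^{rT} = (\mathcal{F}_T - \mathcal{F}_0)e^{rT} = (S_T - \mathcal{F}_0)e^{rT}. \]

At maturity date \(T\), the payoff of strategy A is \((S_T - \mathcal{F}_0)e^{rT} + \mathcal{F}_0 e^{rT} = S_T e^{rT}\),
whereas the payoff of strategy B is \((S_T - F_0)e^{rT} + F_0 e^{rT} = S_T e^{rT}\). Entering into
futures or forward contracts costs nothing, so the initial costs of the two strategies
are \(\mathcal{F}_0\) and \(F_0\), respectively. According to the no-arbitrage argument, these two
strategies must have the same value at any moment before time \(T\). Therefore, we have

\[ \mathcal{F}_0 = F_0. \tag{2.5} \]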

The variance of a portfolio can represent the level of risk to which it exposes the
investor. Suppose that you need to sell \(N_A\) units of stock in the future, at time \(t\).
How many units of futures should you short now so that the variance of your portfolio
is minimized? To answer this question, let \(N_F\) be the number of units of futures you
should short, with a hedge ratio of \(h = N_F/N_A\). This is called a static hedge, as the
hedge is carried out only once at time 0 and does not need to be adjusted later. In
contrast, a dynamic hedge requires continuous re-balancing of the portfolio weights.
More examples of dynamic hedging are introduced in later chapters. The payoff \(Y_t\) of
the hedged portfolio at time \(t\) is given by

\[
\begin{aligned}
Y_t &= N_A S_t - N_F (F_t - F_0) \\
    &= N_A S_0 + N_A (S_t - S_0) - N_F (F_t - F_0) \\
    &= N_A S_0 + N_A \Delta S_t - N_F \Delta F_t \\
    &= N_A S_0 + N_A (\Delta S_t - h \Delta F_t),
\end{aligned}
\]

where \(\Delta S_t = S_t - S_0\) and \(\Delta F_t = F_t - F_0\).

Let \(\sigma_S^2\) be the variance of \(\Delta S_t\), \(\sigma_F^2\) be the variance of \(\Delta F_t\), and \(\rho\)
be the correlation between \(\Delta S_t\) and \(\Delta F_t\). The variance of the portfolio
can be evaluated as:

\[
\begin{aligned}
\operatorname{Var}(Y_t) &= N_A^2 \operatorname{Var}(\Delta S_t - h \Delta F_t) \\
                        &= N_A^2 \left(\sigma_S^2 + h^2 \sigma_F^2 - 2h\rho\sigma_S\sigma_F\right).
\end{aligned}
\]

To minimize the variance with respect to the choice of \(h\), we take the derivative of
\(\operatorname{Var}(Y_t)\) with respect to \(h\) and set it to zero as follows:

\[ \frac{d\operatorname{Var}(Y_t)}{dh} = N_A^2 \left(2h\sigma_F^2 - 2\rho\sigma_S\sigma_F\right) = 0 \implies h^* = \rho\,\frac{\sigma_S}{\sigma_F}. \tag{2.6} \]

From the minimum-variance hedge ratio formula, we can also derive the hedging
effectiveness as follows:

\[
\begin{aligned}
\frac{\operatorname{Var}(\text{unhedged port.}) - \operatorname{Var}(\text{hedged port.})}{\operatorname{Var}(\text{unhedged port.})}
  &= \frac{\operatorname{Var}(N_A S_t) - \operatorname{Var}(Y_t)}{\operatorname{Var}(N_A S_t)} \\
  &= \frac{\sigma_S^2 - (1 - \rho^2)\sigma_S^2}{\sigma_S^2} \\
  &= \rho^2. 
\end{aligned}
\tag{2.7}
\]

Example 2.4 Suppose that we have a set of data on the stock and futures prices, as
shown in Table 2.2. We can calculate the optimum hedge ratio as

\[ h^* = (0.928 \times 0.00262)/0.00313 = 0.777. \]

If we want to sell, say, 50,000 units of the stock at time \(t\), we can calculate \(N_F^*\) as

\[ N_F^* = h^* \times N_A = 0.777 \times 50{,}000 = 38{,}850. \]

Therefore, if there are 1,000 units per futures contract, the portfolio will have the
minimum variance if we short approximately 39 futures contracts.
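
When the raw price changes are available, the hedge ratio of Equation 2.6 can be estimated directly, because \(\rho\sigma_S/\sigma_F\) equals the sample covariance of \(\Delta S\) and \(\Delta F\) divided by the sample variance of \(\Delta F\). The following is a sketch under that identity; the names MinVarHedgeRatio and HedgeRatio_Run are hypothetical, and the demonstration uses only the three rows printed in Table 2.2 (months 1, 2, and 15), so its output is not expected to match the 15-month figure in Example 2.4.

```vba
' Estimate h* = rho * sigma_S / sigma_F = Cov(dS, dF) / Var(dF).
' dS and dF are assumed to have the same bounds and at least two entries.
Function MinVarHedgeRatio(dS() As Double, dF() As Double) As Double
    Dim i As Long, n As Long
    Dim meanS As Double, meanF As Double
    Dim covSF As Double, varF As Double
    n = UBound(dS) - LBound(dS) + 1
    For i = LBound(dS) To UBound(dS)
        meanS = meanS + dS(i) / n
        meanF = meanF + dF(i) / n
    Next i
    For i = LBound(dS) To UBound(dS)
        covSF = covSF + (dS(i) - meanS) * (dF(i) - meanF)
        varF = varF + (dF(i) - meanF) ^ 2
    Next i
    ' The common factor 1/(n - 1) cancels in the ratio.
    MinVarHedgeRatio = covSF / varF
End Function

Sub HedgeRatio_Run()
    Dim dS(1 To 3) As Double, dF(1 To 3) As Double
    ' Illustrative subset only: the three rows shown in Table 2.2.
    dS(1) = 0.029: dS(2) = 0.02: dS(3) = -0.032
    dF(1) = 0.021: dF(2) = 0.035: dF(3) = -0.027
    Cells(1, 1) = MinVarHedgeRatio(dS, dF)
End Sub
```

The number of futures contracts to short then follows by multiplying the estimate by \(N_A\) and dividing by the contract size.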


TABLE 2.2 Data on Stock and Futures Prices

Month       \(\Delta F\)    \(\Delta S\)
1           0.021           0.029
2           0.035           0.020
...         ...             ...
15          -0.027          -0.032
Mean        -0.013          0.0138
\(\sigma\)  0.00313         0.00262
\(\rho\)    0.928
2.2 OPTIONS

Options were first traded on an organized exchange in 1973. Since then, the option 
markets have experienced a dramatic growth. There are two basic types of options, 
namely call and put options. A call (put) option gives the holder the right, but not 
the obligation, to buy (sell) the underlying asset for a prespecified price (strike price 
K) at some future time. In contrast to forward and futures contracts, options will be 
exercised only if exercising is favorable to the holders. 

Options can be further divided into American or European types. American 
options can be exercised at any time up to maturity T, while European options can 
only be exercised on the maturity date T. As the option holder will not lose anything 
in the worst situation (the option is just not exercised), a premium has to be paid 
in exchange for this privilege. The premium that must be paid to the seller is the 
fair value of the option. Derivative pricing theory studies methods of finding fair 
premiums for different kinds of financial derivatives. 

We can either long or short an option, so there are four kinds of payoffs in general,
as summarized in Table 2.3. Figure 2.1 shows the graphs of the payoff functions. The
payoff functions reveal some interesting properties of the options in relation to the
underlying asset. Let \(C_A\) be the American call price with maturity \(T\) and strike \(K\),
\(C_E\) be the corresponding European call price, and \(P_A\) and \(P_E\) be the American and
European put prices, respectively. Some option properties can be derived as follows.


Figure 2.1 Payoffs of option positions.
TABLE 2.3 Payoffs of Different Options with Strike Price K

Type     Call                      Put
Long     \(\max(S_T - K, 0)\)      \(\max(K - S_T, 0)\)
Short    \(-\max(S_T - K, 0)\)     \(-\max(K - S_T, 0)\)
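
The four payoffs in Table 2.3 can be written as two small functions, with the short positions obtained by a change of sign. This is a sketch only: the names LongCallPayoff, LongPutPayoff, and Payoff_Run, and the strike and terminal price used in the demonstration, are hypothetical.

```vba
' Payoff at maturity of a long call: max(ST - K, 0).
Function LongCallPayoff(ByVal ST As Double, ByVal K As Double) As Double
    If ST > K Then LongCallPayoff = ST - K Else LongCallPayoff = 0
End Function

' Payoff at maturity of a long put: max(K - ST, 0).
Function LongPutPayoff(ByVal ST As Double, ByVal K As Double) As Double
    If K > ST Then LongPutPayoff = K - ST Else LongPutPayoff = 0
End Function

Sub Payoff_Run()
    Dim K As Double, ST As Double
    K = 50: ST = 55                         ' hypothetical strike and terminal price
    Cells(1, 1) = LongCallPayoff(ST, K)     ' long call:   5
    Cells(2, 1) = LongPutPayoff(ST, K)      ' long put:    0
    Cells(3, 1) = -LongCallPayoff(ST, K)    ' short call: -5
    Cells(4, 1) = -LongPutPayoff(ST, K)     ' short put:   0
End Sub
```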


1. Upper bounds.

Whatever the price of the underlying asset, a call option (with payoff
\(\max(S - K, 0)\)) can never be worth more than the stock, and an American call
is always worth at least as much as its European counterpart because it can be
exercised at any time, including at maturity, so we have

\[ C_E \le C_A \le S. \]

For put options, no matter how low the stock price becomes, the put can never
be worth more than the strike price:

\[ P_E \le K \quad \text{and} \quad P_A \le K. \]

Furthermore, a European put will not be worth more than \(K\) at
maturity, so its current value cannot be larger than the present value
of the strike:

\[ P_E \le Ke^{-r(T-t)}. \]


2. Lower bounds.

The lower bounds for call options can be derived as follows:

\[ \max(S - Ke^{-r(T-t)}, 0) \le C_E \le C_A. \]

To prove this inequality, we consider two portfolios:

A. Hold one unit of European call and \(K\) units of zero-coupon bonds, each paying $1
at maturity.

B. Long one share of the stock.

By comparing the stock price with the strike price on the maturity date, we can
find the values of the two portfolios, as shown in Table 2.4.

From the table, the value of portfolio A is never less than that of portfolio
B at maturity, so the same must hold at any earlier time. Otherwise, it is always
possible to long portfolio A and short portfolio B to gain arbitrage. Together with
the nonnegativity of option prices, we can deduce that

\[ S \le C_E + Ke^{-r(T-t)} \quad \text{and} \quad 0 \le C_E \implies \max(S - Ke^{-r(T-t)}, 0) \le C_E. \]
TABLE 2.4 Payoffs of Portfolios A and B

               \(S_T > K\)    \(S_T < K\)
Portfolio A    \(S_T\)        \(K\)
Portfolio B    \(S_T\)        \(S_T\)


Similarly, we can obtain the inequality for put options:

\[ \max(Ke^{-r(T-t)} - S, 0) \le P_E \le P_A. \]

This inequality can also be shown by applying the put-call parity to the lower bound
for call options.

3. Put-call parity.

The prices of European call and put options with the same strike and maturity
are related by the following put-call parity formula:

\[ C_E + Ke^{-r(T-t)} = P_E + S. \tag{2.8} \]

To prove this by the no-arbitrage principle, we construct two portfolios:

C. Long a call option and \(K\) units of zero-coupon bonds.

D. Long a put option and one share of stock.

At maturity, both portfolios have the same value (Table 2.5) regardless of the
stock price, so these two portfolios should have the same present value according
to the no-arbitrage principle. Note that the put-call parity relation holds only
for European options. However, we can derive some inequality relations for
American options. For non-dividend-paying assets, we have

\[ P_A \ge P_E \quad \text{and} \quad C_A = C_E. \tag{2.9} \]

It is never optimal to exercise early an American call option on a
non-dividend-paying asset, because doing so gains \(\max(S - K, 0)\), whereas the lower
bound for the call option gives

\[ S - K < S - Ke^{-r(T-t)} \implies \max(S - K, 0) \le \max(S - Ke^{-r(T-t)}, 0) \le C_A. \]


TABLE 2.5 Payoffs of Portfolios C and D

               \(S_T > K\)    \(S_T < K\)
Portfolio C    \(S_T\)        \(K\)
Portfolio D    \(S_T\)        \(K\)
Therefore, holding the call option is worth at least as much as exercising it.
This also proves Equation 2.9. The following example provides some
observations on the Hong Kong Hang Seng Index options market based on the
put-call parity.

Example 2.5 When the interest rate is very close to zero, the put-call parity
relation gives

\[ C_E + K \approx P_E + S. \]

If there is no arbitrage opportunity in the market, this condition
needs to be satisfied. In the case where the underlying asset value is unknown
to us, we can also check the relation by using call and put prices with the same
maturity for two strike prices, \(K_1\) and \(K_2\). Let \(C_E(K_i)\) be the call price with
strike price \(K_i\), and define \(P_E(K_i)\) similarly; then we have

\[
\begin{aligned}
& C_E(K_1) + K_1 \approx P_E(K_1) + S, \\
& C_E(K_2) + K_2 \approx P_E(K_2) + S, \\
\implies\ & \left(C_E(K_1) - C_E(K_2)\right) - \left(P_E(K_1) - P_E(K_2)\right) \approx K_2 - K_1.
\end{aligned}
\]

To verify this claim, we check the prices of the Hang Seng Index options on
September 12, 2014 (Fig. 2.2) with three different maturities. Both sides of this
formula are evaluated in the Excel worksheet. The results show
that the market prices closely match the put-call parity relation.
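
The worksheet check in Figure 2.2 can also be carried out in VBA. The sketch below evaluates the left-hand side of the relation for the three maturities, using the option prices shown in Figure 2.2; the names ParityGap and Example25_Run are hypothetical.

```vba
' Left-hand side of the parity check: [C(K1) - C(K2)] - [P(K1) - P(K2)].
Function ParityGap(ByVal C1 As Double, ByVal C2 As Double, _
                   ByVal P1 As Double, ByVal P2 As Double) As Double
    ParityGap = (C1 - C2) - (P1 - P2)
End Function

Sub Example25_Run()
    ' Nov 2014, strikes 24600 and 25200: compare with K2 - K1 = 600.
    Cells(1, 1) = ParityGap(701, 443, 667, 999)      ' 590
    ' Dec 2014, strikes 24200 and 25000: compare with K2 - K1 = 800.
    Cells(2, 1) = ParityGap(1051, 648, 629, 1027)    ' 801
    ' Jun 2015, strikes 26000 and 26600: compare with K2 - K1 = 600.
    Cells(3, 1) = ParityGap(740, 574, 2512, 2946)    ' 600
End Sub
```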


4. Differences between American call and put prices.

For a non-dividend-paying asset, we can further deduce the boundaries of the
difference between the prices of American call and put options. According to
the put-call parity,

\[
\begin{aligned}
P_A &\ge P_E \\
    &= C_E + Ke^{-r(T-t)} - S \\
    &= C_A + Ke^{-r(T-t)} - S \\
\implies C_A - P_A &\le S - Ke^{-r(T-t)}.
\end{aligned}
\]

To obtain the lower bound of the difference, consider the following two
portfolios with the same maturity and same strike on the options:

E. Long a European call and hold \(K\) units of cash.

F. Long an American put and one unit of the asset.
Maturity = Nov, 2014               Call Option          Put Option
Strike K                           24600    25200       24600    25200
Option Price                       701      443         667      999
[C(K1)-C(K2)]-[P(K1)-P(K2)] =      590
K2-K1 =                            600

Maturity = Dec, 2014               Call Option          Put Option
Strike K                           24200    25000       24200    25000
Option Price                       1051     648         629      1027
[C(K1)-C(K2)]-[P(K1)-P(K2)] =      801
K2-K1 =                            800

Maturity = Jun, 2015               Call Option          Put Option
Strike K                           26000    26600       26000    26600
Option Price                       740      574         2512     2946
[C(K1)-C(K2)]-[P(K1)-P(K2)] =      600
K2-K1 =                            600

Figure 2.2 The prices of Hang Seng Index options on September 12, 2014.


When the American put is not exercised prematurely, portfolio F is worth
\(\max(K - S_T, 0) + S_T = \max(S_T, K)\) at maturity \(T\). The value of portfolio E at
maturity \(T\) is given by

\[ \max(S_T - K, 0) + Ke^{r(T-t)} = \max(S_T, K) + K\left(e^{r(T-t)} - 1\right). \]

The value of portfolio E is larger than that of portfolio F if the put is not
exercised early. If the American put option is exercised prematurely at time \(\tau < T\),
the value of portfolio F at time \(\tau\) is \(K\), while portfolio E is worth
\(C_E + Ke^{r(\tau-t)}\), which is greater than the value of portfolio F. The payoffs of
these two portfolios are summarized in Table 2.6.

Portfolio E is worth at least as much as portfolio F under all circumstances, so the
present value of portfolio E should be no smaller than that of portfolio F:

\[ C_E + K \ge P_A + S. \]

Note that \(C_A = C_E\) for non-dividend-paying assets. By rearranging this
inequality and combining it with the upper bound obtained from the put-call parity,
we obtain the boundaries as follows:

\[ S - K \le C_A - P_A \le S - Ke^{-r(T-t)}. \tag{2.10} \]
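
As an illustration of how Equation 2.10 might be used to screen quoted prices, the following sketch tests whether a pair of American call and put prices on a non-dividend-paying asset lies within the bounds. The function name WithinAmericanBounds, the parameter tau (standing for \(T - t\) in years), and the prices in the demonstration are all hypothetical.

```vba
' Returns True if S - K <= CA - PA <= S - K * exp(-r * tau), as in Equation 2.10.
Function WithinAmericanBounds(ByVal S As Double, ByVal K As Double, _
                              ByVal CA As Double, ByVal PA As Double, _
                              ByVal r As Double, ByVal tau As Double) As Boolean
    Dim diff As Double
    diff = CA - PA
    WithinAmericanBounds = (diff >= S - K) And (diff <= S - K * Exp(-r * tau))
End Function

Sub Bounds_Run()
    ' Hypothetical quotes: S = 30, K = 30, r = 10%, 3 months to maturity.
    ' The bounds are 0 <= CA - PA <= 30 - 30 * exp(-0.025), roughly 0.74.
    Cells(1, 1) = WithinAmericanBounds(30, 30, 1.2, 0.8, 0.1, 0.25)   ' True
    Cells(2, 1) = WithinAmericanBounds(30, 30, 2, 0.5, 0.1, 0.25)     ' False
End Sub
```

A False result would point to a possible arbitrage opportunity, subject to transaction costs and the accuracy of the quotes.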
TABLE 2.6 Payoffs of Portfolios E and F

               No Early Exercise                          Early Exercise of American Put at \(\tau\)
               (Value at Maturity \(T\))                  (Value at Exercising Time \(\tau\))
Portfolio E    \(\max(S_T, K) + K(e^{r(T-t)} - 1)\)       \(C_E + Ke^{r(\tau-t)}\)
Portfolio F    \(\max(S_T, K)\)                           \(K\)

TABLE 2.7 Properties of Stock Options

                  \(C_E\)    \(P_E\)    \(C_A\)    \(P_A\)
Stock price       +          -          +          -
Strike            -          +          -          +
Maturity          +/-        +/-        +          +
Volatility        +          +          +          +
Risk-free rate    +          -          +          -
Dividends         -          +          -          +

The “+” sign indicates that the option rises in value with an increase in the
parameter; the “-” sign represents a decrease in value; and “+/-” represents an
unclear influence on the price.


The inequality (2.10) also shows that when the interest rate \(r \approx 0\),
\(S - K \approx C_A - P_A\). The put-call parity also implies that \(C_A + K \approx P_E + S\) for
non-dividend-paying assets. Therefore, we can deduce that

\[ C_A - P_A \approx S - K \approx C_A - P_E \implies P_A \approx P_E \]

for a near-zero interest rate. To price an option, a model of the stock price usually has
to be specified unless the option can be perfectly replicated by other securities
in the market. The next few chapters are devoted to the Black-Scholes model. Some
qualitative properties related to the option parameters are summarized in Table 2.7.


2.3 EXERCISES 

1. Assume today to be March 3, 2014, and the continuously compounding inter- 
est rate is 0.4% per annum. It is known that the interest rate will increase lin- 
early over time to 1.2% until March 7, 2013. Consider a 1-year futures contract, 
a 1-year European call option, and a 1-year equity swap (ES) contract on a 
non-dividend-paying stock with a current price of $40. The ES with four trans- 
action dates on June 2, 2014; September 2, 2014; December 2, 2014; and March 
2, 2015 will deliver one unit of the underlying stock to the holder while receiving
a constant amount of cash on each transaction date. 

(a) What is the no-arbitrage forward price of a 1-year forward contract? Is it the 
same as the no-arbitrage futures price? Explain briefly. 

(b) What is the ES price (the constant cash amount paid to the exchange for the 
stock on each transaction date)? 

2. A 3-month European-style derivative with the following payoff is selling at $2:

\[ \text{Payoff} = \min\left[\max(33 - S_T, 0), \max(S_T - 27, 0)\right]. \]

At the moment, the underlying stock price is $30 and the European call prices
with strikes $27, $30, and $33 are $3.2193, $1.20, and $0.2874, respectively.
Assume that the risk-free interest rate is 10% per annum for all maturities. What
arbitrage opportunity does this exotic derivative create? Construct an
arbitrage strategy in detail.

3. An out-range option has the same payoff as an ordinary option, except that it
cannot be exercised if the terminal asset price falls within a predetermined range.
A range-digital option pays the holder $1 if the terminal asset price falls into the
prespecified range; otherwise, the holder receives nothing. Other things being
equal, we introduce the following notations:

• Out-range put = \(P_R(K, L, U)\), where \(K\) is the strike price and \([L, U]\) is the range.

• European put = \(P_E(K)\), where \(K\) is the strike price.

• European call = \(C_E(K)\), where \(K\) is the strike price.

• Range-digital option = \(D(L, U)\), where \([L, U]\) is the range.

Suppose \(K > U > L\). Show the Range-Digital-European (RDE) parity relation:

\[ P_R(K, L, U) + (K - L)D(L, U) + (U - L)\left[e^{-rT} - D(0, U)\right] = P_E(K) + C_E(L) - C_E(U). \]

4. A 4-month European call option on a dividend-paying stock is currently selling
for $5. The stock price is $64, the strike price is $60, and a dividend of $0.80 is 
expected in 1 month. The risk-free interest rate is 12% per annum for all maturi- 
ties. What opportunities are there for an arbitrageur? 

5. Assume that the risk-free interest rate is 4% per annum with continuous com- 
pounding and that the dividend yield on a stock index varies throughout the year. 
In February, May, August, and November, the dividend yield is 6% per annum, 
and in other months it is 3% per annum. Suppose that the value of the index on 
July 31, 2010 is 300. What is the futures price for a contract that is deliverable 
on December 31, 2010? 

6. A 1-year-long forward contract on a non-dividend-paying stock is entered into
when the stock price is $40 and the risk-free interest rate is 5% per annum with
continuous compounding.

(a) What are the forward price and the initial value of the forward contract?

(b) Six months later, the stock price is $45 and the risk-free interest rate is still 
5%. What are the forward price and the forward value of the contract? 

7. A company enters into a forward contract with a bank to sell a foreign currency
for \(K_1\) at time \(T_1\). The exchange rate at time \(T_1\) proves to be \(S_1\) (\(> K_1\)). The
company asks the bank if it can roll the contract forward until time \(T_2\) (\(> T_1\)) rather than
settle at time \(T_1\). The bank agrees to a new delivery price, \(K_2\). Explain how \(K_2\)
should be calculated.

8. An ES is a contract that generalizes a forward contract. For a two-tenor ES, the
long position will pay the short position $K at each of the time points \(T_1\) and \(T_2\), where
\(T_1 < T_2\), while the short position will deliver one unit of the underlying asset \(S\)
at both \(T_1\) and \(T_2\). Let \(f(t, S)\) be the value of the ES and \(F_t\) be the ES price, which
makes the ES value zero at time \(t < T_1\).

(a) What are the no-arbitrage pricing formulas for \(f(t, S)\) and \(F_t\)?

(b) Consider that \(T_1\) is 3 months from today and \(T_2\) is 1 year from today. Suppose
that the continuously compounded interest rate is a constant 3%,
and the underlying non-dividend-paying share is currently $10. What is the
no-arbitrage value of \(K\)?

(c) What is the no-arbitrage price for an \(n\)-tenor ES? The \(n\)-tenor ES has \(n\)
transaction dates at \(T_1 < T_2 < \cdots < T_n\).

9. A minimum put option, \(P_{\min}\), gives the holder the right to sell the less expensive
stock between \(S_1\) and \(S_2\) with a strike price of \(K\) on maturity.

(a) What is the payoff function of this option?

(b) Alternatively, a minimum call option, \(C_{\min}\), gives the holder the right to buy
the less expensive stock between \(S_1\) and \(S_2\) with a strike price of \(K\) on the
maturity date. Given that the payoff of an exchange option, \(C_x\), is \(\max(S_2 - S_1, 0)\),
use the no-arbitrage principle to show that

\[ P_{\min}(t, T) - C_{\min}(t, T) = C_x(t, T) + Ke^{-r(T - t)} - S_2(t). \]

10. In Figure 2.2, we can see that the prices of the options on the Hang Seng Index
that mature in June 2015 match the put-call parity relation, and the difference in 
the prices from the put-call relation for options that mature in December 2014 is 
small, which is reasonable due to the transaction cost. For the options that mature 
in November 2014, the discrepancy in the prices from the put-call parity is not 
small. What portfolio can you construct for an arbitrage opportunity? 

11. Example 2.4 computes the minimum-variance hedge ratio for a specific data set.
Now use the Hang Seng Index to compute the minimum-variance hedge ratio in
an Excel worksheet, as in Table 2.2. Use the nearest 3-month daily mid-closing
prices of the futures and the Hang Seng Index. The Excel functions AVERAGE,
STDEV, and CORREL may be helpful in the computation.

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.
INTRODUCTION TO SIMULATION


3.1 QUESTIONS 

In this introductory chapter, we are faced with three basic questions as follows: 

• What is simulation? 

• Why does one need to learn simulation? 

• What has simulation to do with risk management and, in particular, financial 
risk management? 


3.2 SIMULATION 

When faced with uncertainties, one tries to build a probability model. In other words, 
risks and uncertainties can be handled (managed) by means of stochastic models. 
However, in real life, building a full-blown stochastic model to account for every 
possible uncertainty is futile. One needs to compromise between choosing a model 
that is a realistic replica of the actual situation and choosing one whose mathematical 
(statistical) analysis is tractable. 

However, even with the best insight and powerful mathematical
knowledge, solving a model analytically is the exception rather than the rule. In most
situations, one relies on an approximate model and learns about this model through
approximate solutions. It is in this context that simulation comes into the picture.


Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong. 
©2015 John Wiley & Sons, Inc. Published 2015 by John Wiley & Sons, Inc. 
Loosely speaking, one can think of simulations as computer experiments. They play the
role of the experimental part in physics. When one studies a physical phenomenon,
one relies on physical theories and experimental verifications. When one tries to
model a random phenomenon, one relies on building an approximate model (or an
idealized model) and on simulations (computer experiments).

Through simulations, one learns about different characteristics of the model, 
behaviors of the phenomenon, and features of the approximated solutions. Ulti- 
mately, simulations offer practitioners the ability to replicate the underlying scenario 
via computer experiments. They help us to visualize the model, to study the model, and
to improve the model.

In this book, we learn some of the features of simulations. We see that simulation is 
a powerful tool for analyzing complex situations. We also study different techniques 
in simulations and their applications in risk management. 


3.3 EXAMPLES 

Practical implementation of risk management methods usually requires substantial 
computations. The computational requirement comes from calculating summaries, 
such as value-at-risk, hedging ratio, market \(\beta\), and so on. In other words, summarizing
data in complex situations is a routine job for a risk manager, but the same can
be said for a statistician. Therefore, many of the simulation techniques developed 
by statisticians for summarizing data are equally applicable in the risk management 
context. In this section, we study some typical examples. 

3.3.1 Quadrature 

Numerical integration, also known as quadrature, is probably one of the earliest tech- 
niques that requires simulation. Consider a one-dimensional integral 



\[ I = \int_a^b f(x)\,dx, \tag{3.1} \]

where \(f\) is a given function. Quadrature approximates \(I\) by calculating \(f\) at a number of
points \(x_1, x_2, \ldots, x_n\) and applying some formula to the resulting values \(f(x_1), \ldots, f(x_n)\).
The simplest form is a weighted average

\[ \hat{I} = \sum_{i=1}^{n} w_i f(x_i), \]

where \(w_1, \ldots, w_n\) are some given weights. Different quadrature rules are distinguished
by using different sets of design points \(x_1, \ldots, x_n\) and different sets of
weights \(w_1, \ldots, w_n\). As an example, the simplest quadrature rule divides the interval
\([a, b]\) into \(n\) equal parts, evaluates \(f(x)\) at the midpoint of each subinterval, and then
applies equal weights. In this case,

\[ \hat{I} = \frac{b - a}{n} \sum_{i=1}^{n} f\!\left(a + \frac{(2i - 1)(b - a)}{2n}\right). \]

This rule approximates the integral by the sum of the areas of rectangles with base
\((b - a)/n\) and height equal to the value of \(f(x)\) at the midpoint of the base. For large
\(n\), we have a sum of many tiny rectangles whose total area closely approximates \(I\), in
exactly the same way that integrals are introduced in elementary calculus.
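
As a minimal sketch of the midpoint rule just described, the following Python fragment evaluates the rule for a given integrand. The function name `midpoint_rule` and the test integrand \(x e^{-x}\) on \([0, 1]\) are our own illustrative choices and are not taken from the online VBA files.

```python
import math

def midpoint_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal-width midpoint rectangles."""
    h = (b - a) / n  # common base width of each rectangle
    # The i-th midpoint is a + (2i - 1)(b - a) / (2n) for i = 1, ..., n.
    return h * sum(f(a + (2 * i - 1) * h / 2) for i in range(1, n + 1))

# Illustrative integrand (an assumption): f(x) = x * exp(-x) on [0, 1],
# whose exact integral is 1 - 2/e.
approx = midpoint_rule(lambda x: x * math.exp(-x), 0.0, 1.0, 1000)
exact = 1.0 - 2.0 / math.e
print(approx, exact)
```

Increasing `n` shrinks the gap between the two printed values, which mirrors the statement that many tiny rectangles approximate the integral closely.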

Why do we care about evaluating Equation 3.1? For one, we may want to calculate
the expected value of a random quantity \(X\) with p.d.f. (probability density
function) \(f(x)\). In this case, we calculate

\[ E(X) = \int x f(x)\,dx, \]

and quadrature techniques may come in handy if this integral cannot be solved analytically.
Improvements over the simple quadrature rule have been developed, for example,
Simpson’s rule and the Gaussian rule. We will not pursue the details here,
but interested readers may consult Conte and de Boor (1980). Clearly, generalizing
this idea to higher dimensions is highly nontrivial. Many of the numerical integration
techniques break down for evaluating high-dimensional integrals. (Why?)


3.3.2 Monte Carlo 

Monte Carlo integration is a different approach to evaluating an integral of \(f\). It
evaluates \(f(x)\) at random points. Suppose that a series of points \(x_1, \ldots, x_n\) are drawn
independently from the distribution with density \(g(x)\). Now

\[ I = \int f(x)\,dx = \int \frac{f(x)}{g(x)}\,g(x)\,dx = E_g\!\left[\frac{f(x)}{g(x)}\right], \tag{3.2} \]

where \(E_g\) denotes expectation with respect to the distribution \(g\). Now, the sample of
points \(x_1, \ldots, x_n\) drawn independently from \(g\) gives a sample of values \(f(x_i)/g(x_i)\) of
the function \(f(x)/g(x)\). We estimate the integral (Eq. 3.2) by the sample mean

\[ \hat{I} = \frac{1}{n} \sum_{i=1}^{n} \frac{f(x_i)}{g(x_i)}. \]


According to classical statistics, \(\hat{I}\) is an unbiased estimate of \(I\) with variance

\[ \operatorname{Var}(\hat{I}) = \frac{1}{n}\left[\int \frac{f^2(x)}{g(x)}\,dx - I^2\right]. \]

As \(n\) increases, \(\hat{I}\) becomes a more and more accurate estimate of \(I\). The variance
(verify) can be estimated by its sample version, namely,

\[ \frac{1}{n^2} \sum_{i=1}^{n} \frac{f^2(x_i)}{g^2(x_i)} - \frac{\hat{I}^2}{n}. \tag{3.3} \]
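
To make the estimator and its sample variance concrete, here is a small Python sketch. It assumes an illustrative integrand \(f(x) = e^{-x^2}\) on \([0, 1]\) and takes \(g\) to be the uniform density on \([0, 1]\); the helper name `mc_integrate`, the seed, and the sample size are our own hypothetical choices rather than anything prescribed by the text.

```python
import math
import random

def mc_integrate(f, g_pdf, g_sample, n, rng):
    """Estimate the integral of f by the sample mean of f(x)/g(x), with x drawn from g.

    Returns the estimate together with the sample variance estimate of Equation 3.3.
    """
    ratios = []
    for _ in range(n):
        x = g_sample(rng)
        ratios.append(f(x) / g_pdf(x))
    i_hat = sum(ratios) / n
    var_hat = sum(r * r for r in ratios) / n**2 - i_hat**2 / n
    return i_hat, var_hat

rng = random.Random(12345)              # fixed seed, chosen arbitrarily for reproducibility
f = lambda x: math.exp(-x * x)          # illustrative integrand on [0, 1]
g_pdf = lambda x: 1.0                   # uniform density on [0, 1]
g_sample = lambda r: r.random()         # one draw from the uniform density
i_hat, var_hat = mc_integrate(f, g_pdf, g_sample, 20000, rng)
exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)   # closed form for this integrand
print(i_hat, math.sqrt(var_hat), exact)
```

The square root of the variance estimate gives a rough standard error, so the printed estimate should typically lie within a few standard errors of the closed-form value.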


Besides the Monte Carlo method, the quasi-Monte Carlo method has also
attracted considerable attention recently. Further
discussion of this method is beyond the scope of this book. Interested readers may
consult the survey article by Hickernell, Lemieux, and Owen (2005).


3.4 STOCHASTIC SIMULATIONS 

In risk management, one often encounters stochastic processes such as Brownian 
motions, geometric Brownian motion, and lognormal distributions. Although some of 
these entities may be understood analytically, quantities derived from them are often 
less tractable. For example, how can one evaluate integrals such as \(\int_0^1 W(t)\,dW(t)\)
numerically? More importantly, can we use simulation techniques to help us under- 
stand features and behaviors of geometric Brownian motions or lognormal distribu- 
tions? To illustrate the idea, we begin with the lognormal distribution. 

As the lognormal distribution plays such an important role in modeling the stock 
returns, we discuss some properties of the lognormal distribution in this section. 
Firstly, recall that if \(X \sim N(\mu, \sigma^2)\), then the random variable \(Y = e^X\) is lognormally
distributed, that is, \(\log Y = X\) is normally distributed with mean \(\mu\) and variance \(\sigma^2\).
Thus, the distribution of \(Y\) is given by

\[ \begin{aligned}
G(y) = P(Y \le y) &= P(X \le \log y) \\
&= P\!\left(\frac{X - \mu}{\sigma} \le \frac{\log y - \mu}{\sigma}\right) \\
&= \Phi\!\left(\frac{\log y - \mu}{\sigma}\right),
\end{aligned} \]

where \(\Phi(\cdot)\) denotes the distribution function of a standard normal random variable.
Differentiating \(G(y)\) with respect to \(y\) gives rise to the p.d.f. of \(Y\). To calculate \(E(Y)\),
we can integrate directly with respect to the p.d.f. of \(Y\), or we can make use of the
normal distribution properties of \(X\). Recall that the moment-generating function of \(X\)
is given by

\[ M_X(t) = E(e^{tX}) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}. \]

Thus,

\[ E(Y) = E(e^X) = M_X(1) = e^{\mu + \frac{1}{2}\sigma^2}. \]

By a similar argument, we can calculate the second moment of \(Y\) and deduce that

\[ \operatorname{Var}(Y) = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1). \]
To produce the densities of lognormal random variables and generate 1,000 lognormal
random variables in Visual Basic for Applications with \(\mu = 0\) and \(\sigma^2 = 1\), that is,
\(E(Y) = e^{0.5}\) and \(\operatorname{Var}(Y) = e(e - 1)\), go to the Online Supplementary and download the
files Chapter 3 Generate the PDF of Lognormal Random Variables and Chapter 3
Generate Lognormal Random Variables.
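
For readers without access to the VBA files, the same experiment can be sketched in a few lines of Python. This is only an illustration under our own assumptions (the seed and variable names are arbitrary) and is not a translation of the online code.

```python
import math
import random
import statistics

mu, sigma = 0.0, 1.0          # parameters of X ~ N(mu, sigma^2)
rng = random.Random(2015)     # arbitrary fixed seed
y = [math.exp(rng.gauss(mu, sigma)) for _ in range(1000)]   # Y = exp(X)

sample_mean = statistics.mean(y)
sample_var = statistics.variance(y)
theory_mean = math.exp(mu + sigma**2 / 2)                             # e^{0.5}
theory_var = math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)   # e(e - 1)
print(sample_mean, theory_mean)
print(sample_var, theory_var)
```

Because the lognormal distribution has a heavy right tail, the sample variance from 1,000 draws can differ noticeably from its theoretical value, whereas the sample mean is usually much closer.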

It can be seen from Figure 3.1 that a lognormal random variable can never be negative:
its density vanishes on the negative half-line. Furthermore, the density is skewed to the
right and has a much thicker tail than that of a normal random variable.

Before concluding this chapter, we would like to draw readers’ attention to
some existing books on this subject. In the statistical community, many excellent
texts have been written on simulation; see, for example, Ross
(2002) and the references therein. These texts mainly discuss traditional simulation
techniques without much emphasis on finance and risk management. They are
more suitable for a traditional audience in statistics.

In finance, there are several closely related texts. A comprehensive treatise on 
simulations in finance is given in the book by Glasserman (2003). A more succinct 
treatise on simulations in finance is given by Jaeckel (2002). Both of these books 
assume a considerable amount of financial background from the readers. They are 
intended for readers at a more advanced level. A book on simulation based on MATLAB
is Brandimarte (2006). The survey article by Broadie and Glasserman (1998)
offers a succinct account of the essence of simulations in finance. For readers interested
in knowing more about the background of risk management, the two special
volumes of Alexander (1998), the encyclopedic treatise of Crouhy, Galai, and Mark
(2000), and the special volume of Dempster (2002) are excellent sources. The recent
monograph of McNeil, Frey, and Embrechts (2005) offers an up-to-date account on 
topics of quantitative risk management. 

The present text can be considered as a synergy between Ross (2002) and Glasserman
(2003), but at an intermediate level. We hope that readers with some (but not
highly technical) background in either statistics or finance can benefit from reading
this book.

Figure 3.1 Densities of a lognormal distribution with mean \(e^{0.5}\) and variance \(e(e - 1)\), that
is, \(\mu = 0\) and \(\sigma^2 = 1\), and a standard normal distribution.


3.5 EXERCISES 

1. Verify Equation 3.3. 

2. Explain the possible difficulties in implementing quadrature methods to evaluate 
high dimensional numerical integrations. 

3. Using either S-Plus or Visual Basic, simulate 1,000 observations from a lognormal
distribution with mean \(e^2\) and variance \(e^4(e^2 - 1)\). Calculate the sample
mean and sample variance for these observations and compare their values with
the theoretical values.

4. Let a stock have price \(S\) at time 0. At time 1, the stock price may rise to \(S_u\) with
probability \(p\) or fall to \(S_d\) with probability \(1 - p\). Let \(R_S = (S_1 - S)/S\) denote the
return of the stock at the end of period 1.

(a) Calculate \(m_S = E(R_S)\).

(b) Calculate \(v_S = \sqrt{\operatorname{Var}(R_S)}\).

(c) Let \(C\) be the price of a European call option on the stock at time 0 and \(C_1\) be the
price of this option at time 1. Suppose that \(C_1 = C_u\) when the stock price rises
to \(S_u\) and \(C_1 = C_d\) when the stock price falls to \(S_d\). Correspondingly, define the
return of the call option at the end of period 1 as \(R_C = (C_1 - C)/C\). Calculate
\(m_C = E(R_C)\).

(d) Show that \(v_C = \sqrt{\operatorname{Var}(R_C)} = \sqrt{p(1 - p)}\,(C_u - C_d)/C\).

(e) Let \(\Omega = \dfrac{(C_u - C_d)/C}{(S_u - S_d)/S}\), the so-called elasticity of the option. Show that
\(v_C = \Omega v_S\).

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.
BROWNIAN MOTIONS AND ITO’S RULE


4.1 INTRODUCTION 

In this chapter, we learn about the notions of Brownian motion and geometric Brownian
motion (GBM), the latter being one of the most popular models in financial theory.
In addition, Ito’s calculus is introduced. The key aim here
is to develop an operational understanding of Ito’s calculus so that readers
will be able to do simple stochastic integration such as \(\int W^2(t)\,dW(t)\). Finally, we
learn how to simulate these processes and study their corresponding features.


4.2 WIENER AND ITO’S PROCESSES 

Consider the model defined by 

\[ W(t_{k+1}) = W(t_k) + \epsilon_{t_k}\sqrt{\Delta t}, \tag{4.1} \]

where \(t_{k+1} - t_k = \Delta t\) and \(k = 0, \ldots, N\) with \(t_0 = 0\). In this equation, \(\epsilon_{t_k} \sim N(0, 1)\) are
independent and identically distributed (i.i.d.) random variables. Furthermore, assume
that \(W(t_0) = 0\). This is known as the random walk model (except for the factor \(\sqrt{\Delta t}\),
this equation matches the familiar random walk model introduced in elementary
courses). Note that from this model, for \(j < k\),

\[ W(t_k) - W(t_j) = \sum_{i=j}^{k-1} \epsilon_{t_i}\sqrt{\Delta t}. \]

There are a number of consequences as follows: 

1. As the right-hand side is a sum of normal random variables,
\(W(t_k) - W(t_j)\) is also normally distributed.

2. By taking expectations, we have

\[ E(W(t_k) - W(t_j)) = 0, \]

\[ \operatorname{Var}(W(t_k) - W(t_j)) = E\Big[\sum_{i=j}^{k-1} \epsilon_{t_i}\sqrt{\Delta t}\Big]^2 = (k - j)\Delta t = t_k - t_j. \]

3. For \(t_1 < t_2 < t_3 < t_4\), \(W(t_4) - W(t_3)\) is uncorrelated with \(W(t_2) - W(t_1)\).

Equation 4.1 provides a way to simulate a standard Brownian motion (Wiener process).
To see how, consider partitioning \([0, 1]\) into \(n\) subintervals, each with length \(1/n\).
For each number \(t\) in \([0, 1]\), let \([nt]\) denote the greatest integer not exceeding \(nt\). For example,
if \(n = 10\) and \(t = 1/3\), then \([nt] = [10/3] = 3\). Now define a stochastic process on \([0, 1]\) as
follows. For each \(t\) in \([0, 1]\), define

\[ S_{[nt]} = \frac{1}{\sqrt{n}} \sum_{i=1}^{[nt]} \epsilon_i, \tag{4.2} \]

where \(\epsilon_i\) are i.i.d. standard normal random variables. Clearly,

\[ S_{[nt]} = S_{[nt]-1} + \epsilon_{[nt]}\frac{1}{\sqrt{n}}, \tag{4.3} \]

which is a special form of Equation 4.1 with \(\Delta t = 1/n\) and \(W(t) = S_{[nt]}\). Furthermore,
we know that at \(t = 1\),

\[ S_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i \]

has a standard normal distribution. Also, by the Central Limit Theorem, we know
that \(S_n\) tends to a standard normal random variable in distribution even if the \(\epsilon_i\) are
only i.i.d. with mean 0 and variance 1 but not necessarily normally distributed. The idea is that by taking the limit
as \(n\) tends to \(\infty\), the process \(S_{[nt]}\) would tend to a Wiener process in distribution.
Figure 4.1 Sample paths of the process \(S_{[nt]}\) for different \(n\) and the same sequence of \(\epsilon_i\).


Consequently, to simulate a sample path of a Wiener process, all we need to do is to 
iterate Equation 4.3. Figure 4.1 shows the simulations on the basis of Equation 4.3. 

To generate Figure 4.1 in Visual Basic for Applications, go to the Online Supplementary
and download the file Chapter 4 Generate Brownian Motion Paths with
different n.

To generate Figure 4.2 in Visual Basic for Applications, go to the Online Supplementary
and download the file Chapter 4 Sample paths of Brownian Motion on
[0,1].
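
The random walk recursion with step size \(1/\sqrt{n}\) can also be sketched in Python. The fragment below is our own illustration, not the online VBA code; the function name `random_walk_path`, the seed, and the numbers of steps and paths are hypothetical choices.

```python
import math
import random

def random_walk_path(n, rng):
    """Return S_0, S_1, ..., S_n on the grid t = 0, 1/n, ..., 1.

    Each step adds an independent standard normal draw scaled by 1/sqrt(n).
    """
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, 1.0) / math.sqrt(n))
    return path

rng = random.Random(4)  # arbitrary fixed seed
path = random_walk_path(500, rng)
print(path[-1])  # one draw of S_n, an approximation to W(1)

# Terminal values over many independent paths should have mean near 0 and variance near 1.
terminals = [random_walk_path(100, rng)[-1] for _ in range(2000)]
mean_t = sum(terminals) / len(terminals)
var_t = sum((x - mean_t) ** 2 for x in terminals) / (len(terminals) - 1)
print(mean_t, var_t)
```

Plotting `path` against the grid \(0, 1/n, \ldots, 1\) gives a picture of the same kind as the sample paths in the figures.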

In other words, by taking the limit as \(\Delta t\) tends to zero, we get a Wiener process
(Brownian motion), that is,

\[ dW(t) = \epsilon(t)\sqrt{dt}, \]

where \(\epsilon(t)\) are uncorrelated standard normal random variables. We can interpret this
equation as a continuous-time approximation of the random walk model (Eq. 4.1);
see Chan (2010). Of course, such an approximation can be dubious because we do
not know if this limiting operation is well defined. In more advanced courses in probability
(see Billingsley (1999), for example), it is shown that this limiting operation
is well defined and that we indeed obtain a Wiener process as the limit of the aforementioned
operation. Formally, we define a Wiener process \(W(t)\)
as follows.

Definition 4.1 A Wiener process \(W(t)\) is a stochastic process that satisfies the
following properties:

• For \(s < t\), \(W(t) - W(s)\) is a normally distributed random variable with mean 0
and variance \(t - s\).

• For \(0 < t_1 < t_2 < t_3 < t_4\), \(W(t_4) - W(t_3)\) is uncorrelated with \(W(t_2) - W(t_1)\).
This is known as the independent increment property.

• \(W(t_0) = 0\) with probability one.


Figure 4.2 Sample paths of Brownian motions on [0,1].
From this definition, we can deduce a number of properties.

1. For \(t < s\), \(E(W(s) \mid W(t)) = E(W(s) - W(t) + W(t) \mid W(t)) = W(t)\). This is known
as the martingale property of the Brownian motion.

2. The process \(W(t)\) is nowhere differentiable. Consider

\[ E\left[\left(\frac{W(s) - W(t)}{s - t}\right)^2\right] = \frac{1}{s - t}. \]

This term tends to \(\infty\) as \(s - t\) tends to 0. Hence, the process cannot be differentiable,
and we cannot give a precise mathematical meaning to the process
\(dW(t)/dt\).

3. If we formally represent \(\xi(t) = dW(t)/dt\) and call it the white noise process, we can
use it only as a symbol, and its mathematical meaning has to be interpreted in
terms of an integration in the context of a stochastic differential equation.

The idea of a Wiener process can be generalized as follows. Consider a process \(X(t)\)
satisfying the following equation:

\[ dX(t) = \mu\,dt + \sigma\,dW(t), \tag{4.4} \]

where \(\mu\) and \(\sigma\) are constants, and \(W(t)\) is a Wiener process as defined previously. If we
integrate Equation 4.4 over \([0, t]\), we get

\[ X(t) = X(0) + \mu t + \sigma W(t), \]

that is, the process \(X(t)\) satisfies the integral equation

\[ \int_0^t dX(s) = \mu \int_0^t ds + \sigma \int_0^t dW(s). \]

The process \(X(t)\) is also known as a diffusion process or a generalized Wiener process.
In this case, the solution \(X(t)\) can be written down analytically in terms of the
parameters \(\mu\) and \(\sigma\) and the Wiener process \(W(t)\). To extend this idea further, we can
let the parameters \(\mu\) and \(\sigma\) depend on the process \(X(t)\) as well. In that case, we have
what is known as a general diffusion process or an Ito’s process.

Definition 4.2 An Ito’s process is a stochastic process that is the solution to the
following stochastic differential equation (SDE):

\[ dX(t) = \mu(x, t)\,dt + \sigma(x, t)\,dW(t). \tag{4.5} \]

In this equation, \(\mu(x, t)\) is known as the drift function, and \(\sigma(x, t)\) is known as the
volatility function of the underlying process. Of course, we need conditions on \(\mu(x, t)\)
and \(\sigma(x, t)\) to ensure that Equation 4.5 has a solution. We do not discuss these technical
details in this chapter; further details can be found in Karatzas and Shreve (1997)
or Dana and Jeanblanc (2002). We will just assume that the drift and the volatility
are “nice” enough functions so that the existence of a stochastic process \(\{X(t)\}\) that
satisfies Equation 4.5 is guaranteed. Again, this equation has to be interpreted through
integration.
4.3 STOCK PRICE

Recall the multiplicative model 

\[ \log S(k + 1) = \log S(k) + w(k). \]

The continuous-time version of this equation is

\[ d\log S(t) = \nu\,dt + \sigma\,dW(t). \]

The right-hand side of this equation is normally distributed with mean \(\nu\,dt\) and variance
\(\sigma^2\,dt\). Solving this equation by integration,

\[ \log S(t) = \log S(0) + \nu t + \sigma W(t). \]

Then, \(E\log S(t) = \log S(0) + \nu t\). As the expected log price grows linearly with \(t\), just
as in a continuous compound interest formula, the process \(S(t)\) is known as a GBM.
Formally, we define

Definition 4.3 Let \(X(t)\) be a Brownian motion with drift \(\nu\) and variance \(\sigma^2\), that is,

\[ dX(t) = \nu\,dt + \sigma\,dW(t). \]

The process \(S(t) = e^{X(t)}\) is called a GBM with drift parameter \(\mu\), where \(\mu = \nu + \frac{1}{2}\sigma^2\).
In particular, \(S(t)\) satisfies

\[ dS(t) = \mu S(t)\,dt + \sigma S(t)\,dW(t), \]

and

\[ d\log S(t) = (\mu - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW(t). \tag{4.6} \]

To simulate 1,000 GBMs in Visual Basic for Applications with \(\mu = 0.03\) and \(\sigma^2 = 0.04\),
go to the Online Supplementary and download the file Chapter 4 Sample path
of Geometric Brownian Motion on [0,1]. A sample path is plotted in Figure 4.3.
Equivalently, \(S(t)\) is a GBM starting at \(S(0) = z\) if

\[ S(t) = ze^{X(t)} = ze^{\nu t + \sigma W(t)} = ze^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)}. \]

Using this definition, we see that for \(t_0 < t_1 < \cdots < t_n\), the successive ratios

\[ \frac{S(t_1)}{S(t_0)}, \frac{S(t_2)}{S(t_1)}, \ldots, \frac{S(t_n)}{S(t_{n-1})} \]

are independent random variables by virtue of the independent increment property of
the Wiener process. The mean and variance of a geometric Brownian motion can be
computed as in the lognormal distribution. Notice that because a Brownian motion is
normally distributed, we conclude the following:
Figure 4.3 Geometric Brownian motion.


1. \(\log S(t) = X(t) \sim N(\log S(0) + \nu t, \sigma^2 t)\).

2. As \(S(t) = S(0)e^{X(t)}\),

\[ \begin{aligned}
E(S(t)) &= E(E(S(t) \mid S(0) = z)) = E(E(ze^{\nu t + \sigma W(t)} \mid S(0) = z)) \\
&= ze^{(\mu - \frac{1}{2}\sigma^2)t}\,E(e^{\sigma W(t)}) \\
&= ze^{(\mu - \frac{1}{2}\sigma^2)t}\,E(e^{\sigma\sqrt{t}\,\xi}) \qquad (\xi = W(t)/\sqrt{t} \sim N(0, 1)) \\
&= ze^{(\mu - \frac{1}{2}\sigma^2)t}\,e^{\frac{1}{2}\sigma^2 t} \\
&= ze^{\mu t} = S(0)e^{\mu t}.
\end{aligned} \]

This equation has an interesting economic implication in the case where \(\mu\) is
positive but small relative to \(\sigma^2\). On one hand, if \(\mu > 0\), then the mean value
\(E(S(t))\) tends to \(\infty\) as \(t\) tends to \(\infty\). On the other hand, if \(0 < \mu < \frac{1}{2}\sigma^2\), then
the process \(X(t) = X(0) + (\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)\) has a negative drift, that is, it
is drifting in the negative direction as \(t\) tends to \(\infty\). It is intuitively clear
(and can be shown mathematically) that \(X(t)\) tends to \(-\infty\). As a consequence,
the original price \(S(t) = S(0)e^{X(t)}\) tends to 0. The GBM \(S(t)\) is drifting closer
to zero as time goes on, yet its mean value \(E(S(t))\) is continuously increasing.
This example demonstrates the fact that the mean function sometimes can be
misleading in describing the process.

3. Similarly, we can show that

\[ \operatorname{Var}(S(t)) = S(0)^2 e^{2\nu t + \sigma^2 t}(e^{\sigma^2 t} - 1) = S(0)^2 e^{2\mu t}(e^{\sigma^2 t} - 1). \]
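
The gap between the mean of a GBM and the behavior of a typical path can be checked by simulation. The Python sketch below uses illustrative values of our own choosing (drift parameter 0.05, volatility 0.5, horizon 10, starting price 1), selected so that the drift parameter is positive but smaller than half the squared volatility. It draws the terminal price exactly from its lognormal distribution and uses the standard fact that the median of a lognormal variable is the exponential of the mean of its logarithm.

```python
import math
import random
import statistics

mu, sigma, s0, t = 0.05, 0.5, 1.0, 10.0   # illustrative values with 0 < mu < sigma^2 / 2
nu = mu - 0.5 * sigma**2                   # drift of log S(t); negative here
rng = random.Random(43)                    # arbitrary fixed seed

# Exact draws of S(t) = S(0) * exp(nu * t + sigma * W(t)), with W(t) ~ N(0, t).
draws = [s0 * math.exp(nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0))
         for _ in range(50000)]

sample_mean = sum(draws) / len(draws)
sample_median = statistics.median(draws)
share_below_start = sum(d < s0 for d in draws) / len(draws)
print(sample_mean, s0 * math.exp(mu * t))      # the mean grows like S(0) e^{mu t}
print(sample_median, s0 * math.exp(nu * t))    # the median decays like S(0) e^{nu t}
print(share_below_start)                       # most paths end below the starting price
```

With these values, the sample mean sits above the starting price while well over half of the simulated terminal prices fall below it, which illustrates why the mean function alone can mislead.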

4.4 ITO’S FORMULA 

In the preceding section, we defined \(S(t)\) in terms of \(\log S(t)\) as a Brownian motion.
Although such a definition facilitates many of the calculations, it may sometimes be
desirable to examine the behavior of the original price process \(S(t)\) directly. To see
how this can be done, first recall from calculus that

\[ d\log S(t) = \frac{dS(t)}{S(t)}. \]

We might be tempted to substitute this elementary fact into Equation 4.6 to get

\[ \frac{dS(t)}{S(t)} = \nu\,dt + \sigma\,dW(t). \]

However, this computation is NOT exactly correct because it involves the differential
\(dW(t)\). A rule of thumb is that whenever we need to substitute quantities regarding
\(dW(t)\), there is a correction term that needs to be accounted for. We shall provide an
argument for this correction term later. For the time being, the correct expression of
the previous equation should be

\[ \frac{dS(t)}{S(t)} = (\nu + \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW(t) = \mu\,dt + \sigma\,dW(t), \tag{4.7} \]

as \(\nu = \mu - \frac{1}{2}\sigma^2\). The correction term required when transforming \(\log S(t)\) to \(S(t)\) is
given by Ito’s lemma. We discuss this in the next theorem. Before doing that,
there are a number of remarks.


Remarks 


1. The term \(dS(t)/S(t)\) can be thought of as the differential return of a stock, and
Equation 4.7 says that the differential return possesses a simple form \(\mu\,dt + \sigma\,dW(t)\).

2. Note that Equation 4.7 is an equation about the ratio \(dS(t)/S(t)\). This term
can also be thought of as the instantaneous return of the stock. Hence, Equation
4.7 describes the dynamics of the instantaneous return process.

3. In the case of deterministic dynamics, that is, without the stochastic component
\(dW(t)\) in Equation 4.7, this equation reduces to the familiar form of a
compound return. For example, let \(P(t)\) denote the price of a bond that pays $1
at time \(t = T\). Assume that the interest rate \(r\) is constant over time and there are
no other payments before maturity; the price of the bond satisfies

\[ \frac{dP(t)}{P(t)} = r\,dt. \]

In other words, \(P(t) = P(0)e^{rt} = e^{r(t - T)}\), after taking the boundary condition
\(P(T) = 1\) into account.
4. Note that Equation 4.7 provides a way to simulate the price process \(S(t)\). Suppose
we start at \(t_0\) and let \(t_k = t_0 + k\Delta t\). According to Equation 4.7, the simulation
equation is

\[ S(t_{k+1}) - S(t_k) = \mu S(t_k)\Delta t + \sigma S(t_k)\epsilon(t_k)\sqrt{\Delta t}, \]

where \(\epsilon(t_k)\) are i.i.d. standard normal random variables. Rearranging this equation,
we get

\[ S(t_{k+1}) = [1 + \mu\Delta t + \sigma\epsilon(t_k)\sqrt{\Delta t}]\,S(t_k), \tag{4.8} \]

which is a multiplicative model, but the coefficient is normal rather than lognormal.
So this equation does not generate the lognormal price distribution.
However, when \(\Delta t\) is sufficiently small, the differences may be negligible.

5. Instead of using Equation 4.7, we can use Equation 4.6 for the log prices and
get

\[ \log S(t_{k+1}) - \log S(t_k) = \nu\Delta t + \sigma\epsilon(t_k)\sqrt{\Delta t}. \]

This equation leads to

\[ S(t_{k+1}) = e^{\nu\Delta t + \sigma\epsilon(t_k)\sqrt{\Delta t}}\,S(t_k), \tag{4.9} \]

which is also a multiplicative model, but now the random coefficient is lognormal.
In general, we can use either Equation 4.8 or Equation 4.9 to simulate
stock prices.
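
The two simulation schemes in Remarks 4 and 5 can be compared directly by driving both with the same random shocks. The Python sketch below is an illustration under our own assumptions: the function name `simulate_gbm`, the unit time horizon, the starting price of 1, and the seed are hypothetical choices, while the drift parameter 0.03 and volatility 0.2 correspond to the values used for the VBA example.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, n_steps, rng):
    """Run the normal-coefficient and lognormal-coefficient schemes on [0, 1] with shared shocks."""
    dt = 1.0 / n_steps
    nu = mu - 0.5 * sigma**2          # drift of the log price
    s_euler, s_log = s0, s0
    for _ in range(n_steps):
        eps = rng.gauss(0.0, 1.0)
        shock = sigma * eps * math.sqrt(dt)
        s_euler *= 1.0 + mu * dt + shock      # Remark 4: coefficient is normal
        s_log *= math.exp(nu * dt + shock)    # Remark 5: coefficient is lognormal
    return s_euler, s_log

rng = random.Random(8)                 # arbitrary fixed seed
mu, sigma = 0.03, 0.2                  # drift 0.03 and variance 0.04
for n_steps in (10, 100, 1000):
    s_euler, s_log = simulate_gbm(1.0, mu, sigma, n_steps, rng)
    print(n_steps, s_euler, s_log, abs(s_euler - s_log))
```

As the number of steps grows, the two terminal prices typically move closer together, in line with the remark that the difference may be negligible for small time steps.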

With this background, we are now ready to state the celebrated Ito’s lemma,
which accounts for the correction term.

Theorem 4.1 Suppose the random process \(x(t)\) satisfies the diffusion equation

\[ dx(t) = a(x, t)\,dt + b(x, t)\,dW(t), \]

where \(W(t)\) is a standard Brownian motion. Let the process \(y(t) = F(x, t)\) for some
function \(F\). Then the process \(y(t)\) satisfies the Ito’s equation

\[ dy(t) = \left(\frac{\partial F}{\partial x}a + \frac{\partial F}{\partial t} + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}b^2\right)dt + \frac{\partial F}{\partial x}b\,dW(t). \tag{4.10} \]

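Proof. Observe that if the process is deterministic, ordinary calculus shows that for
a function of two variables such as \(y(t) = F(x, t)\), the total differential \(dy\) is given by

\[ dy = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial t}\,dt = \left(\frac{\partial F}{\partial x}a + \frac{\partial F}{\partial t}\right)dt + \frac{\partial F}{\partial x}b\,dW. \]

Comparing this expression with Equation 4.10, we see that there is an extra correction
term \(\frac{1}{2}\frac{\partial^2 F}{\partial x^2}b^2\) in front of \(dt\). To see how this term arises, consider expanding the function
\(F\) in a Taylor expansion up to terms of first order in \(\Delta t\). Note that as \(\Delta W\) and
hence \(\Delta x\) are of order \(\sqrt{\Delta t}\), such an expansion would lead to terms of second
order in \(\Delta x\). In this case,

\[ \begin{aligned}
y + \Delta y &= F(x, t) + \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial t}\Delta t + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}(\Delta x)^2 \\
&= F(x, t) + \frac{\partial F}{\partial x}(a\Delta t + b\Delta W) + \frac{\partial F}{\partial t}\Delta t + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}(a\Delta t + b\Delta W)^2.
\end{aligned} \]

Now focus on the quadratic expression in the last term. When expanded, it becomes

\[ a^2(\Delta t)^2 + 2ab(\Delta t)(\Delta W) + b^2(\Delta W)^2. \]

The first two terms of the aforementioned expression are of orders higher than \(\Delta t\),
so they can be dropped as we only want terms up to the order of \(\Delta t\). The last term
\(b^2(\Delta W)^2\) is all that remains. Recalling that \(\Delta W \sim N(0, \Delta t)\) (recall the earlier fact that
\(dW(t) = \epsilon(t)\sqrt{dt}\)), it can be shown that \((\Delta W)^2 \approx \Delta t\). In other words, we have the
following approximation

\[ (dW(t))^2 = dt \quad \text{or} \quad dW(t) = \sqrt{dt}. \]

Substituting this into the expansion, we have

\[ y + \Delta y = F(x, t) + \left(\frac{\partial F}{\partial x}a + \frac{\partial F}{\partial t} + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}b^2\right)\Delta t + \frac{\partial F}{\partial x}b\,\Delta W. \]

Taking the limit as \(\Delta t \to 0\) and noting \(y(t) = F(x, t)\) completes the proof. □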

Example 4.1 Suppose \(S(t)\) satisfies the geometric Brownian motion equation

\[ dS(t) = \mu S(t)\,dt + \sigma S(t)\,dW(t). \]

Now use Ito’s formula to find the equation governing the process \(F(S(t)) = \log S(t)\).
Using Equation 4.10, we identify \(a = \mu S\) and \(b = \sigma S\). Furthermore, we know that
\(\partial F/\partial S = 1/S\) and \(\partial^2 F/\partial S^2 = -1/S^2\). According to Equation 4.10, we get

\[ d\log S = \left(\frac{\mu S}{S} - \frac{1}{2}\frac{\sigma^2 S^2}{S^2}\right)dt + \frac{\sigma S}{S}\,dW = (\mu - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW, \]

which agrees with the earlier discussion.
Example 4.2 Evaluate

\[ \int_0^t s\,dW(s). \]

To evaluate this integral, let us first guess the answer to be the one given by the classical
integration by parts formula. That is, we might guess \(tW(t) - \int_0^t W(s)\,ds\) to be
the answer. To verify it, we differentiate this quantity to see if it recovers the
integrand. To do this, use the following steps:

1. Let \(X(t) = W(t)\); then \(dX(t) = dW(t)\), and we identify \(a = 0\) and \(b = 1\) in
Equation 4.10.

2. Let \(Y(t) = F(W(t), t) = tW(t)\). Then \(\partial F/\partial W = t\), \(\partial^2 F/\partial W^2 = 0\), and \(\partial F/\partial t = W(t)\).

3. Substituting these expressions into Ito’s Lemma, we have \(dY(t) = t\,dW(t) + W(t)\,dt\).

4. Integrating the preceding equation, we have

\[ tW(t) = \int_0^t dY(s) = \int_0^t s\,dW(s) + \int_0^t W(s)\,ds, \]

that is,

\[ \int_0^t s\,dW(s) = tW(t) - \int_0^t W(s)\,ds, \]

as required.


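Example 4.3 Evaluate

\[ \int_0^t W(s)\,dW(s). \]

First guess an answer, \(W^2(t)/2\), say. Is this answer correct? To check, we differentiate
again and apply Ito’s Lemma. Using the recipe,

1. Let \(X(t) = W(t)\); then \(dX(t) = dW(t)\), and we identify \(a = 0\) and \(b = 1\) in
Equation 4.10.

2. Let \(Y(t) = F(W(t)) = W^2(t)/2\). Then \(\partial F/\partial W = W\), \(\partial^2 F/\partial W^2 = 1\), and
\(\partial F/\partial t = 0\).

3. Recite Ito’s Lemma:

\[ dY(t) = \left(\frac{\partial F}{\partial W}a + \frac{\partial F}{\partial t} + \frac{1}{2}\frac{\partial^2 F}{\partial W^2}b^2\right)dt + \frac{\partial F}{\partial W}b\,dW(t), \]

so that

\[ dY(t) = \tfrac{1}{2}\,dt + W(t)\,dW(t). \]

4. Integrating the preceding equation, we get

\[ Y(t) = \frac{W^2(t)}{2} = \frac{t}{2} + \int_0^t W(s)\,dW(s). \]

In other words,

\[ \int_0^t W(s)\,dW(s) = \frac{W^2(t)}{2} - \frac{t}{2}. \]

5. This time, our initial guess was not correct. We need the extra correction term
\(-t/2\) from Ito’s Lemma.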
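
The correction term can also be seen numerically. The Python sketch below approximates the stochastic integral of \(W\) against itself over \([0, 1]\) by a sum in which the integrand is evaluated at the left endpoint of each small interval, and then compares the sum with \(W^2(t)/2 - t/2\) and with the naive guess \(W^2(t)/2\). The function name `ito_sum_w_dw`, the grid size, and the seed are our own hypothetical choices.

```python
import math
import random

def ito_sum_w_dw(t, n, rng):
    """Left-endpoint sum of W(t_i) * (W(t_{i+1}) - W(t_i)) on a grid of n steps over [0, t]."""
    dt = t / n
    w, total = 0.0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        total += w * dw       # integrand evaluated at the left endpoint
        w += dw
    return total, w

rng = random.Random(44)  # arbitrary fixed seed
t = 1.0
ito_sum, w_t = ito_sum_w_dw(t, 100000, rng)
print(ito_sum)                    # numerical approximation of the stochastic integral
print(0.5 * w_t**2 - 0.5 * t)     # W^2(t)/2 - t/2, including the correction term
print(0.5 * w_t**2)               # naive guess W^2(t)/2
```

The first two printed numbers should be close, whereas the naive guess is off by roughly \(t/2\).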


Example 4.4 Let \(W_t\) be a standard Brownian motion and let \(Y_t = W_t^3\). Evaluate \(dY_t\).

Let \(X_t = W_t\) and \(F(X, t) = X^3\). Then the diffusion is \(dX_t = dW_t\) with \(a = 0\) and
\(b = 1\). Further,

\[ \frac{\partial F}{\partial X} = 3X^2, \quad \frac{\partial^2 F}{\partial X^2} = 6X, \quad \frac{\partial F}{\partial t} = 0. \]

Using Ito’s lemma, we have

\[ dY_t = 3W_t\,dt + 3W_t^2\,dW_t. \]

Integrating both sides of this equation, we get

\[ \int_0^t dY_s = \int_0^t 3W_s\,ds + \int_0^t 3W_s^2\,dW_s, \]

\[ Y_t = W_t^3 = 3\int_0^t W_s\,ds + 3\int_0^t W_s^2\,dW_s. \]

In other words,

\[ \int_0^t W_s^2\,dW_s = \frac{W_t^3}{3} - \int_0^t W_s\,ds. \]

In general, one gets

\[ \int_0^t W_s^m\,dW_s = \frac{W_t^{m+1}}{m + 1} - \frac{m}{2}\int_0^t W_s^{m-1}\,ds, \quad m = 0, 1, 2, \ldots. \tag{4.11} \]

Jo m +1 2 Jq 


Example 4.5 Let

$$dX_t = \tfrac{1}{2}X_t\,dt + X_t\,dW_t. \qquad (4.12)$$

Evaluate $d\log X_t$.

From the given diffusion, we have $a = X_t/2$ and $b = X_t$. Let
$Y_t = F(X, t) = \log X_t$. Then

$$\frac{\partial F}{\partial X} = \frac{1}{X}, \quad
\frac{\partial^2 F}{\partial X^2} = -\frac{1}{X^2}, \quad
\frac{\partial F}{\partial t} = 0.$$

Using Ito’s lemma, the $dt$ terms cancel and we get $dY_t = d\log X_t = dW_t$. That is,
$Y_t = W_t$ when $Y_0 = 0$. Therefore, $X_t = e^{W_t}$ is a solution to Equation 4.12.

Example 4.6 Let the diffusion be

$$dX_t = \tfrac{1}{2}\,dt + dW_t. \qquad (4.13)$$

Evaluate $d\,e^{X_t}$.

From the given diffusion, we have $a = \tfrac{1}{2}$ and $b = 1$. Let
$Y_t = F(X, t) = e^{X_t}$. Then

$$\frac{\partial F}{\partial X} = e^{X}, \quad
\frac{\partial^2 F}{\partial X^2} = e^{X}, \quad
\frac{\partial F}{\partial t} = 0.$$

Using Ito’s lemma, we get $dY_t = e^{X_t}\,dt + e^{X_t}\,dW_t$ so that

$$dY_t = Y_t\,dt + Y_t\,dW_t.$$

Example 4.7 Find the solution to the stochastic differential equation

$$dX_t = X_t\,dt + dW_t, \quad X_0 = 0.$$

Multiplying both sides of the SDE by the integrating factor $e^{-t}$, we have

$$e^{-t}\,dX_t = e^{-t}X_t\,dt + e^{-t}\,dW_t.$$

Let $Y_t = e^{-t}X_t$. Then $Y_0 = 0$ and by means of Ito’s lemma, we have

$$dY_t = -e^{-t}X_t\,dt + e^{-t}\,dX_t = e^{-t}\,dW_t.$$

Integrating both sides of this equation,

$$Y_t - Y_0 = \int_0^t e^{-s}\,dW_s,$$

so that

$$X_t = e^{t}Y_t = \int_0^t e^{(t-s)}\,dW_s.$$

More generally, if we are given the SDE

$$dX_t = \mu X_t\,dt + \sigma\,dW_t,$$

then using the same method by considering the process $Y_t = e^{-\mu t}X_t$, it can be easily
shown that the solution to this SDE is given by the process

$$X_t = e^{\mu t}X_0 + \sigma\int_0^t e^{\mu(t-s)}\,dW_s.$$

Such a process is known as the Ornstein-Uhlenbeck process, which is often used in
modeling bond prices.

4.5 EXERCISES


1. Let $W_t$ be a Wiener process. Now is at time $t_0$. Find the mean and variance of $X_t$
if

(a) X, = a, (W t - W h ) - (J 2 (w h -W tQ ), t>t 2 > tl > t 0 . 

(b) X, = <7, (W t - W t2 ) - a 2 (w fl - W tQ ) , r > ?! > u > t 0 - 

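(c) $X_t = \sum_{j=1}^{n} f(W_{t_{j-1}})\,(W_{t_j} - W_{t_{j-1}})$, $t_0 < t_1 < \cdots < t_n = t$.

(d) Use (c) to show that

$$E\left[\int_0^t f(W_\tau, \tau)\,dW_\tau\right] = 0$$

and

$$E\left[\left(\int_0^t f(W_\tau, \tau)\,dW_\tau\right)^2\right] = \int_0^t E f(W_\tau, \tau)^2\,d\tau.$$

Notice that the aforementioned two identities are known as Ito’s identities.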


2. Let X t satisfy the stochastic differential equation 

dX. = --dt+ -dW„ 

3 2 

where $X_0 = 0$ and $W_t$ is a standard Brownian motion process. Define $S_t = e^{X_t}$ so
that $S_0 = 1$.

(a) Find the stochastic differential equation that governs $S_t$.

(b) Simulate 10 independent paths of $S_t$ for $t = 1, \ldots, 30$. Call these paths $S_t^i$,
$i = 1, \ldots, 10$ and plot them on the same graph.

(c) What can you conclude about $S_t$ for $t$ large?

(d) With $n = 10$, evaluate

$$\bar{S}_{30} = \frac{1}{n}\sum_{i=1}^{n} S_{30}^{i} \qquad (4.14)$$

at $t = 30$.

(e) Simulate 100 independent paths and calculate Equation 4.14 with n = 100. 
What can you conclude about S 1000 when n tends to infinity? 

3. A stock price is governed by

$$dS(t) = aS(t)\,dt + bS(t)\,dW(t),$$

where $a$ and $b$ are given constants and $W(t)$ is a standard Brownian motion process.
Find the stochastic differential equation that governs

$$G(t) = \sqrt{S(t)}.$$

4. Consider a stock price $S$ governed by the geometric Brownian motion process

$$\frac{dS(t)}{S(t)} = 0.10\,dt + 0.30\,dW(t),$$

where $W(t)$ is a standard Brownian motion process.

(a) Using $\Delta t = 1/12$ and $S(0) = 1$, simulate 5,000 years of the process $\log S(t)$
and evaluate

$$\frac{1}{t}\log S(t) \qquad (4.15)$$

as a function of $t$. Note that Equation 4.15 tends to a limit $p$. What is the
theoretical value of $p$? Does your simulation match this value?

(b) Evaluate

$$\frac{1}{t}\{\log S(t) - pt\}^2 \qquad (4.16)$$

as a function of $t$. Does this tend to a limit?

5. Simulate a standard Brownian motion process $W(t)$ at grids
$0 < \frac{1}{n} < \frac{2}{n} < \cdots < \frac{n-1}{n} < 1$ with $n = 10{,}000$. Let
$W_i = W\left(\frac{i}{n}\right)$ for $i = 0, \ldots, n$ with $W(0) = 0$.
Suppose you want to evaluate the integral

$$\int_0^1 W(s)\,dW(s) \qquad (4.17)$$

via the approximating sum

$$S_\epsilon = \sum_{i=0}^{n-1}\{(1 - \epsilon)W_i + \epsilon W_{i+1}\}\{W_{i+1} - W_i\}. \qquad (4.18)$$

(a) On the basis of simulated values of $W_i$, use Equation 4.18 to evaluate
Equation 4.17 with $\epsilon = 0$. Does your result match the one obtained
from Ito’s formula?

(b) On the basis of simulated values of $W_i$, use Equation 4.18 to evaluate
Equation 4.17 with $\epsilon = \frac{1}{2}$. This is known as the Stratonovich integral. Using
your calculated results, can you guess the difference between Ito’s integral
and the Stratonovich integral?

6. Let $W_t$ denote a standard Brownian motion process.

(a) Let $Y_t = F(W_t) = e^{W_t}$. Write down the diffusion equation that governs $Y_t$.

(b) Evaluate $\int_0^t e^{W_s}\,dW_s$.

7. Denote $X_t$ as the Brownian motion with drift $\mu$ and volatility $\sigma$.

(a) Find $df$ and $dg$ where $f(t, X) = tX_t$ and $g(t, X) = tX_t^2$.

(b) Financial market practitioners usually consider the time average of the
underlying asset price when making investment decisions. If the asset evolves
as a Brownian motion $X_t$, then the time average can be viewed as a
stochastic variable

$$A_t = \frac{1}{t}\int_0^t X_s\,ds.$$

What is the distribution for $A_t$?

(c) Suppose $X_0 = 70$, $\mu = 0.5$, and $\sigma = 0.4$. Simulate $X_1$ and $A_1$ with
$\Delta t = 0.01$. What are the sample means and variances for $X_1$ and $A_1$ for 1,000
simulations? What is the covariance between the two random variables, $X_1$ and
$A_1$?

(d) Comment on your simulation result.

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.

BLACK-SCHOLES MODEL AND
OPTION PRICING


5.1 INTRODUCTION 

In this chapter, we apply Ito’s Lemma to derive the celebrated option pricing formula
of Black and Scholes (1973). This formula has far-reaching consequences and plays a
fundamental role in modern option pricing theory. Immediately after Black and Scholes,
Merton (1973) strengthened and improved the option pricing theory in several ways. To
recognize their contributions, Merton and Scholes were awarded the Nobel prize in
economics in 1997.

What is an option? An option is a financial derivative (contingent claim) that gives
the holder the right (but not the obligation) to buy or to sell an asset for a certain price
by a certain date. The option that gives the holder a purchasing right is termed a
call option, whereas the put option gives the holder the selling right. The price in
the contract is known as the exercise price or strike price ($K$); the date is known as
the expiration or maturity ($T$). American options can be exercised at any time up to
expiration. European options can be exercised only on the expiration date. As option
holders are given a right, they have to pay an option premium to enter the contract.
This premium is usually known as the option price.

Four basic option positions are possible: 

1. A long position in a call option. Payoff = $\max(S_T - K, 0)$.

2. A long position in a put option. Payoff = $\max(K - S_T, 0)$.


Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong. 
3. A short position in a call option. Payoff = $-\max(S_T - K, 0)$.

4. A short position in a put option. Payoff = $-\max(K - S_T, 0)$.

Notice that the long position in a put option is different from the short position of a 
call option. A long position in an option always has a non-negative payoff, whereas a 
short position in an option always has a nonpositive payoff, but the option premium 
is collected up front. Option pricing means determining the correct option premium. 

To illustrate the Black-Scholes formula, we first discuss some fundamental con- 
cepts in a one period binomial model from which a risk-neutral argument is intro- 
duced. 

5.2 ONE PERIOD BINOMIAL MODEL 

Consider a binomial model in one period. Let $S_0$ and $f$ denote the initial price of one
share of a stock and an option on the stock. After one period, the price of the stock can
be either $uS_0$ or $dS_0$, where $u > 1$ designates an upward movement of the stock price
and $d < 1$ designates a downward movement of the stock price. Correspondingly, the
payoff of the option after one period can be either $f_u$ or $f_d$ depending on whether
the stock moves up or down. For instance, $f_u = \max(S_0u - K, 0)$ and
$f_d = \max(S_0d - K, 0)$ for a call option. Schematically, the one period outcome can be
represented by Figure 5.1.

Now consider constructing a hedging portfolio as follows. Suppose that we long
(buy and hold) $\Delta$ shares of the stock and short (sell) one call option (European).
Suppose that the option lasts for one period $T$ and, during the life of the option, the
stock can move either up from $S_0$ to $uS_0$ or down from $S_0$ to $dS_0$. Furthermore, suppose
that the risk-free rate in this period is denoted by $r$. The value of this hedging portfolio
in the next period is

$\Delta uS_0 - f_u$, if stock moves up,
$\Delta dS_0 - f_d$, if stock moves down.

This portfolio will be risk free if $\Delta$ is chosen so that the value of this portfolio is the
same at the end of one period regardless of the stock going up or down, that is,

$$\Delta uS_0 - f_u = \Delta dS_0 - f_d.$$



Figure 5.1 One period binomial tree. [From $(S_0, f)$ the stock price and option value
move either up to $(uS_0, f_u)$ or down to $(dS_0, f_d)$.]

Solving for $\Delta$, we get

$$\Delta = \frac{f_u - f_d}{uS_0 - dS_0}.$$

As this portfolio is risk free in the sense that it attains the same value regardless of the
outcome of the stock, it must earn the risk-free rate. Otherwise, one could take advantage
of an arbitrage opportunity. For example, if the return of this hedging portfolio
is larger than the risk-free rate, one could borrow money from the bank to purchase
this portfolio and lock in the fixed return. After one period, the proceeds from the
portfolio can be used to repay the loan and the arbitrageur pockets the difference.
Consequently, the present value of this portfolio must equal $(\Delta uS_0 - f_u)e^{-rT}$. If we
let $f$ denote the value of the option at present, then the present value of the portfolio
is $S_0\Delta - f$, and according to the no arbitrage assumption,

$$S_0\Delta - f = (\Delta uS_0 - f_u)e^{-rT}.$$


Consequently,

$$\begin{aligned}
f &= S_0\Delta - (\Delta uS_0 - f_u)e^{-rT} \\
  &= S_0\Delta(1 - ue^{-rT}) + f_ue^{-rT} \\
  &= \frac{f_u - f_d}{u - d}(1 - ue^{-rT}) + f_ue^{-rT} \\
  &= e^{-rT}\left[\frac{f_u - f_d}{u - d}(e^{rT} - u) + f_u\right] \\
  &= e^{-rT}\left[f_u\frac{e^{rT} - d}{u - d} + f_d\frac{u - e^{rT}}{u - d}\right] \\
  &= e^{-rT}[pf_u + (1 - p)f_d],
\end{aligned}$$

where $p = \dfrac{e^{rT} - d}{u - d}$. This identity has a very natural interpretation. If we
interpret the value $p$ just defined as the probability that the stock moves up in a
risk-neutral world, then the aforementioned formula simply states the fact that, in the
risk-neutral world,

$$f = e^{-rT}\hat{E}(f) = e^{-rT}(pf_u + (1 - p)f_d),$$


that is, the expected value of the option in one period discounted by the risk-free rate
equals the present value of the option. Note that the expected value in this case is
denoted by $\hat{E}$, which is the expectation taken under the new probability measure $p$.
For this reason, $p$ is known as the risk-neutral probability. The same reasoning can
be used to evaluate the stock itself. Note that

$$\begin{aligned}
\hat{E}(S_T) &= puS_0 + (1 - p)dS_0 \\
 &= pS_0(u - d) + dS_0 \\
 &= \frac{e^{rT} - d}{u - d}S_0(u - d) + dS_0 \\
 &= e^{rT}S_0.
\end{aligned}$$


In other words, the stock grows at the risk-free rate under the risk-neutral probability (in
the risk-neutral world). Therefore, setting the probability of the stock price moving
up to be $p$ is tantamount to assuming that the stock grows at the risk-free
rate in a risk-neutral world. In a risk-neutral world, all individuals are indifferent to
risk and require no compensation for risk. The expected return of all securities is the
risk-free interest rate. It is for this reason that such a computation is usually known
as the risk-neutral valuation, and it is equivalent to the no arbitrage assumption in
general.

Example 5.1 Suppose the current price of one share of a stock is $20 and in a period
of 3 months, the price will be either $22 or $18. Suppose we sold a European call
option with a strike price of $21 in 3 months. Let the annual risk-free rate be 12% and
let $p$ denote the probability that the stock moves up in 3 months in the risk-neutral
world. Note that the payoff of the option is either $f_u = \$1$ if the stock moves up or
$f_d = \$0$ if the stock moves down. How much is the option, $f$, worth today? To find $f$,
we can use the risk-neutral valuation method. Recall that from the aforementioned
discussion,

$$22p + 18(1 - p) = 20e^{0.12/4},$$

so that $p = 0.6523$. Using the expected payoff of the option, we get

$$\hat{E}(f) = pf_u + (1 - p)f_d = p + (1 - p)0 = p = 0.6523.$$

Therefore, the value of the option for today is

$$f = e^{-rT}\hat{E}(f) = e^{-0.12/4}(pf_u + (1 - p)f_d) = 0.633.$$

Alternatively, we can try to solve the same problem using the arbitrage-free argu- 
ment. 

Example 5.2 With the same parameters as in the preceding example, consider solving
for $\Delta$. Firstly, as we want a risk-free profit for the hedging portfolio, we want to
purchase $\Delta$ shares of the stock and short one European call option expiring in 3
months. After 3 months, the value of the portfolio can be either

$22\Delta - 1$, if the stock price moves to $22,
or
$18\Delta$, if the stock price moves to $18.

This portfolio is risk free if $\Delta$ is chosen so that the value of the portfolio remains the
same for both alternatives, that is,

$$22\Delta - 1 = 18\Delta, \quad \text{which means } \Delta = 0.25.$$

The value of the portfolio in 3 months becomes

$$22 \times 0.25 - 1 = 4.5 = 18 \times 0.25.$$

By the no arbitrage consideration, this risk-free profit must earn the risk-free interest
rate. In other words, the value of the portfolio today must equal the present value of
$4.5, that is, $4.5e^{-0.12/4} = 4.367$. If the value of the option today is denoted by $f$, then
the present value of the portfolio equals

$$20 \times 0.25 - f = 4.5e^{-0.12/4} = 4.367.$$

Solving for $f$ gives

$$f = 0.633,$$

which matches the answer of the preceding example.
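The two routes taken in Examples 5.1 and 5.2 can be placed side by side in a few lines of code. The following is a minimal sketch under the one period binomial assumptions of this section; the function name `one_period_call` and its argument names are illustrative choices, not notation from the text.

```python
import math

def one_period_call(s0, s_up, s_down, strike, r, T):
    """Price a European call in a one period binomial model two ways.

    Returns (p, delta, price_risk_neutral, price_hedging).
    """
    u, d = s_up / s0, s_down / s0
    f_u = max(s_up - strike, 0.0)     # option payoff if the stock moves up
    f_d = max(s_down - strike, 0.0)   # option payoff if the stock moves down

    # Risk-neutral valuation: discounted expected payoff under p.
    p = (math.exp(r * T) - d) / (u - d)
    price_rn = math.exp(-r * T) * (p * f_u + (1.0 - p) * f_d)

    # Hedging argument: long delta shares, short one call.
    delta = (f_u - f_d) / (s_up - s_down)
    portfolio_end = delta * s_up - f_u    # same value in both states
    price_hedge = s0 * delta - portfolio_end * math.exp(-r * T)

    return p, delta, price_rn, price_hedge

# Parameters of Examples 5.1 and 5.2: S0 = 20, up 22, down 18, K = 21, r = 12%, T = 3 months.
p, delta, price_rn, price_hedge = one_period_call(20.0, 22.0, 18.0, 21.0, 0.12, 0.25)
print(p, delta, price_rn, price_hedge)
```

Both prices should come out at about 0.633, with $p$ about 0.6523 and $\Delta = 0.25$, in line with the two examples.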

In general, this principle can be applied to a multiperiod binomial tree. We do not
go into the analysis of a multiperiod model and refer the readers to Chapter 11 of
Hull (2006) for further details. For a comprehensive discussion on the discrete-time
approach, see Pliska (1997). Although these two examples are illustrated with a call
option, by the same token, the same principle can be used to price a put option; again
details can be found in Hull (2006).


5.3 THE BLACK-SCHOLES-MERTON EQUATION 

The Black-Scholes option pricing equation initiated the modern theory of finance.
Its development has triggered an enormous amount of research and revolutionized the
practice of finance. The equation was developed under the assumption that the price
fluctuation of the underlying security can be described by a diffusion process studied
earlier. The logic behind the equation is conceptually identical to the binomial lattice:
at each moment two available securities are combined to construct a portfolio that
reproduces the local behavior of a contingent claim. Historically, the Black-Scholes
theory predates the binomial lattice.

To begin, let $S$ denote the price of an underlying security (stock) governed by a
geometric Brownian motion over a time interval $[0, T]$ by

$$dS = \mu S\,dt + \sigma S\,dW, \qquad (5.1)$$

where $W$ is a standard Brownian motion process. Assume further that there is also a
risk-free asset (bond) carrying an interest rate $r$ over the time interval $[0, T]$ such that

$$dB = rB\,dt. \qquad (5.2)$$

Consider a contingent claim that is a derivative (call option) of $S$. The price of this
derivative is a function of $S$ and $t$, that is, let $f(S, t)$ be the price of the claim at time $t$
when the stock price is $S$. Our goal is to find an equation that models the behavior of
$f(S, t)$. This goal is attained by the celebrated Black-Scholes-Merton equation.


Theorem 5.1 Using the notation just defined, and assuming that the price and the
bond are described by the geometric Brownian motion (Eq. 5.1) and the compound
interest rate model (Eq. 5.2), respectively, the price of the derivative of this security
satisfies

$$\frac{\partial f}{\partial t} + \frac{\partial f}{\partial S}rS
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2 = rf. \qquad (5.3)$$


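Proof. The idea of this proof is the same as the binomial lattice. In deriving the
binomial model, we form a portfolio with portions of the stock and the bond so
that the portfolio exactly matches the return characteristics of the derivative in a
period-by-period manner. In the continuous-time framework, the matching is done
at each instant. Specifically, by Ito’s lemma, recall that

$$df = \left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t}
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2\right)dt
+ \frac{\partial f}{\partial S}\sigma S\,dW. \qquad (5.4)$$

This is also a diffusion process for $f$ with drift
$\left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t}
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2\right)$ and diffusion
coefficient $\frac{\partial f}{\partial S}\sigma S$.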

Construct a portfolio of $S$ and $B$ that replicates the behavior of the derivative. At
each time $t$, we select an amount $x_t$ of the stock and an amount $y_t$ of the bond, giving a
total portfolio value of $G(t) = x_tS(t) + y_tB(t)$. We wish to select $x_t$ and $y_t$ so that $G(t)$
replicates the derivative value $f(S, t)$. The instantaneous gain in value of this portfolio
due to changes in security prices is

$$\begin{aligned}
dG &= x_t\,dS + y_t\,dB \\
   &= x_t(\mu S\,dt + \sigma S\,dW) + y_trB\,dt \\
   &= (x_t\mu S + y_trB)\,dt + x_t\sigma S\,dW.
\end{aligned} \qquad (5.5)$$

As we want the portfolio gain of $G(t)$ to behave similarly to the gain of $f$, we match
the coefficients of $dt$ and $dW$ in Equation 5.5 to those of Equation 5.4. Firstly, we
match the coefficient of $dW$ in these two equations and we get

$$x_t = \frac{\partial f}{\partial S}.$$

Secondly, as $G(t) = x_tS(t) + y_tB(t)$, we get

$$y_t = \frac{1}{B(t)}\left(G(t) - x_tS(t)\right).$$

Thirdly, remember we want $G = f$, therefore,

$$y_t = \frac{1}{B(t)}\left(f(S, t) - \frac{\partial f}{\partial S}S(t)\right).$$

Substituting this expression into Equation 5.5 and matching the coefficient of $dt$ in
Equation 5.4, we have

$$\frac{\partial f}{\partial S}\mu S + \frac{1}{B(t)}\left(f(S, t) - \frac{\partial f}{\partial S}S(t)\right)rB(t)
= \frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t}
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2.$$

Consequently,

$$\frac{\partial f}{\partial t} + \frac{\partial f}{\partial S}rS
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2 = rf.$$


Remarks 

1. If $f(S, t) = S$, then $\frac{\partial f}{\partial t} = 0$, $\frac{\partial f}{\partial S} = 1$, and
$\frac{\partial^2 f}{\partial S^2} = 0$, and Equation 5.3 reduces to $rS = rS$ so that
$f(S, t) = S$ is a solution to Equation 5.3.

2. As another simple example, consider a bond where $f(S, t) = e^{rt}$. This is a trivial
derivative of $S$, and it can be easily shown that this $f$ satisfies Equation 5.3.

3. In general, Equation 5.3 provides a way to price a derivative by using the
appropriate boundary conditions. Consider a European call option with strike price
$K$ and maturity $T$. Let the price be $C(S, t)$. Clearly, this derivative must satisfy

$$C(0, t) = 0, \qquad C(S, T) = \max(S - K, 0).$$

For a European put option, the boundary conditions are

$$P(\infty, t) = 0, \qquad P(S, T) = \max(K - S, 0).$$




Other derivatives may have different boundary conditions. For a knock-out
option that will be canceled if the underlying asset breaches a prespecified barrier
level ($H$), in addition to the aforementioned conditions, we have an extra
boundary condition

$$f(S = H, t) = 0.$$

4. With these boundary conditions, one can try to solve for the function $f$ from
the Black-Scholes equation. One problem is that this is a partial differential
equation (PDE), and there is no guarantee that an analytical solution exists.
Except in the simple case of a European option, one cannot find an analytic
formula for the function $f$. In practice, either simulation or numerical methods
have to be used to find an approximate solution.

5. Alternatively, we can derive Equation 5.3 as follows. Construct a portfolio that
consists of shorting one derivative and longing $\frac{\partial f}{\partial S}$ shares of the
stock. Let the value of this portfolio be $\Pi$ and let the value of the derivative be
$f(S, t)$. Then

$$\Pi = -f + \frac{\partial f}{\partial S}S. \qquad (5.6)$$

The change $\Delta\Pi$ in the value of this portfolio in the time interval $\Delta t$ is given by

$$\Delta\Pi = -\Delta f + \frac{\partial f}{\partial S}\Delta S. \qquad (5.7)$$

Recall that $S$ follows a geometric Brownian motion so that

$$\Delta S = \mu S\,\Delta t + \sigma S\,\Delta W.$$

In addition, from Equation 5.4, the discrete version of $df$ is

$$\Delta f = \left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t}
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2\right)\Delta t
+ \frac{\partial f}{\partial S}\sigma S\,\Delta W.$$

Substituting these two expressions into Equation 5.7, we get

$$\Delta\Pi = \left(-\frac{\partial f}{\partial t}
- \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2\right)\Delta t. \qquad (5.8)$$

Note that by holding such a portfolio, the random component $\Delta W$ has been
eliminated completely. Because this equation does not involve $\Delta W$, this portfolio
must earn the risk-free rate during the time $\Delta t$. Consequently,

$$\Delta\Pi = r\Pi\,\Delta t,$$

where $r$ is the risk-free rate. In other words, using Equation 5.8 and Equation
5.6, we obtain

$$\left(\frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2\right)\Delta t
= r\left(f - \frac{\partial f}{\partial S}S\right)\Delta t.$$

Therefore,

$$\frac{\partial f}{\partial t} + \frac{\partial f}{\partial S}rS
+ \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2S^2 = rf.$$

It should be noted that the portfolio used in deriving Equation 5.3 is not permanently
risk free. It is risk free only for an infinitesimally short period of time. As $S$
and $t$ change, $\frac{\partial f}{\partial S}$ also changes. To keep the portfolio risk free, we
have to change the relative proportions of the derivative and the stock in the portfolio
continuously.

Example 5.3 Let $f$ denote the price of a forward contract on a non-dividend-paying
stock with delivery price $K$ and delivery date $T$. Its price at time $t$ is given by

$$f(S, t) = S - Ke^{-r(T - t)}. \qquad (5.9)$$

Hence,

$$\frac{\partial f}{\partial t} = -rKe^{-r(T - t)}, \quad
\frac{\partial f}{\partial S} = 1, \quad \text{and} \quad
\frac{\partial^2 f}{\partial S^2} = 0.$$

Substituting these into the left-hand side of Equation 5.3, we get

$$-rKe^{-r(T - t)} + rS = rf.$$

Thus, the price formula of $f$ given by Equation 5.9 is a solution of the Black-Scholes
equation, indicating that Equation 5.9 is the correct formula.
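The check carried out in Example 5.3 can also be done numerically for any candidate price function. The sketch below is only an illustration: it approximates the partial derivatives by central finite differences and evaluates the residual $\partial f/\partial t + rS\,\partial f/\partial S + \frac{1}{2}\sigma^2S^2\,\partial^2 f/\partial S^2 - rf$, which should vanish for a solution of the Black-Scholes equation. The function name `bs_residual`, the step size `h`, and the parameter values are assumptions made for the example.

```python
import math

def bs_residual(f, S, t, r, sigma, h=1e-3):
    """Finite-difference residual of the Black-Scholes equation at (S, t).

    f is a candidate price function f(S, t); a value near zero means that
    f satisfies the equation at this point, up to discretization error.
    """
    f_t = (f(S, t + h) - f(S, t - h)) / (2 * h)
    f_S = (f(S + h, t) - f(S - h, t)) / (2 * h)
    f_SS = (f(S + h, t) - 2 * f(S, t) + f(S - h, t)) / h ** 2
    return f_t + r * S * f_S + 0.5 * sigma ** 2 * S ** 2 * f_SS - r * f(S, t)

K, T, r, sigma = 60.0, 1.0, 0.1, 0.2   # illustrative values only

def forward(S, t):
    # Forward contract price of the form S - K exp(-r (T - t)).
    return S - K * math.exp(-r * (T - t))

print(bs_residual(forward, 62.0, 0.5, r, sigma))                  # close to 0
print(bs_residual(lambda S, t: S * S, 62.0, 0.5, r, sigma))       # far from 0
```

The forward price gives a residual that is zero up to rounding, whereas an arbitrary function such as $S^2$ does not, so it cannot be the price of a traded derivative in this model.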

The Black-Scholes equation generates two important insights. The first one is the
concept of risk-neutral pricing. As the Black-Scholes equation does not involve the
drift, $\mu$, of the underlying asset price, the option pricing formula should be independent
of the drift. Therefore, individual preferences toward the performance or the
trend of a particular asset price do not affect the current price of the option on that
asset. The second insight is that one would be able to derive a price representation of
a European option with any payoff function from the equation. It is summarized in
the following theorem.

Theorem 5.2 Consider a European option with payoff $F(S)$ and expiration time
$T$. Suppose that the continuous compounding interest rate is $r$. Then, the current
European option price is determined by

$$f(S, 0) = e^{-rT}\hat{E}[F(S_T)], \qquad (5.10)$$

where $\hat{E}$ denotes the expectation under the risk-neutral probability that is derived
from the risk-neutral process

$$\frac{dS_t}{S_t} = r\,dt + \sigma\,dW(t). \qquad (5.11)$$

Proof. Notice that the current price of the option $f(S, 0)$ is a deterministic function
of time $t = 0$ and the current asset price $S$. Consider a stochastic process $\{X_t\}$ that
satisfies

$$X_0 = S \quad \text{and} \quad \frac{dX_t}{X_t} = r\,dt + \sigma\,dW(t).$$

Then, $f(S, 0) = f(X_0, 0)$. Consider the process $f(X_t, t)$ derived from the stochastic
process $\{X_t\}$. By Ito’s lemma, the differential form of $f$ is

$$df = \left(\frac{\partial f}{\partial t} + rX\frac{\partial f}{\partial X}
+ \frac{1}{2}\sigma^2X^2\frac{\partial^2 f}{\partial X^2}\right)dt
+ \sigma X\frac{\partial f}{\partial X}\,dW.$$

The Black-Scholes equation says that the coefficient of $dt$ is identical to the term $rf$;
see Theorem 5.1. The total differential for the pricing function is simplified as

$$df = rf\,dt + \sigma X\frac{\partial f}{\partial X}\,dW,$$

which implies

$$df - rf\,dt = \sigma X\frac{\partial f}{\partial X}\,dW.$$

The left-hand side of the aforementioned equation can be combined with the product
rule of differentiation to yield

$$e^{rt}\,d\left[e^{-rt}f(X_t, t)\right] = \sigma X_t\frac{\partial f}{\partial X}\,dW.$$

This expression has an equivalent integration form,

$$e^{-rT}f(X_T, T) - f(X_0, 0) = \sigma\int_0^T e^{-rt}X_t\frac{\partial f}{\partial X}\,dW.$$

The right-hand side is an Ito integral with respect to Brownian motion, so that it has
an expected value of zero. After taking expectation on both sides,

$$\hat{E}\left[e^{-rT}f(X_T, T) - f(X_0, 0)\right] = 0.$$

This implies

$$f(X_0, 0) = e^{-rT}\hat{E}[f(X_T, T)].$$

By the terminal condition specified in the Black-Scholes equation, $f(X_T, T) = F(X_T)$,
the payoff of the option contract. Hence, we have

$$f(S, 0) = e^{-rT}\hat{E}[F(X_T)],$$

where the expectation with respect to the random variable $X_T$ is called the risk-neutral
expectation, and the process $\{X_t\}$ is called the risk-neutral asset dynamics. To avoid
confusion, financial economists always use the term “asset price process in the
risk-neutral world ($S_t$)” to represent the $X_t$ in this proof. It establishes Equations 5.10
and 5.11 and completes the proof. □
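Theorem 5.2 translates directly into a simulation recipe: draw the terminal asset price under the risk-neutral process, average the payoff, and discount. The following is a minimal sketch of that idea, assuming the closed-form terminal value $S_T = S_0\exp\{(r - \sigma^2/2)T + \sigma\sqrt{T}Z\}$ with $Z \sim N(0, 1)$, which follows from applying Ito’s lemma to $\log S_t$. The function name `mc_european_price`, the number of paths, and the seed are illustrative assumptions.

```python
import math
import random

def mc_european_price(payoff, s0, r, sigma, T, n_paths=100_000, seed=1):
    """Monte Carlo estimate of exp(-rT) * E[payoff(S_T)] under the
    risk-neutral process dS/S = r dt + sigma dW."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * T
    vol = sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * math.exp(drift + vol * z)   # terminal price, risk-neutral
        total += payoff(s_T)
    return math.exp(-r * T) * total / n_paths

# European call with S0 = 62, K = 60, r = 10%, sigma = 20%, T = 5 months.
call = mc_european_price(lambda s: max(s - 60.0, 0.0), 62.0, 0.1, 0.2, 5 / 12)
# Forward contract payoff S_T - K; its price should be near S0 - K exp(-rT).
fwd = mc_european_price(lambda s: s - 60.0, 62.0, 0.1, 0.2, 5 / 12)
print(call, fwd)
```

Because the drift $\mu$ of the real-world process never enters the function, the sketch also illustrates the first insight above: the estimated price does not depend on $\mu$. The estimates carry Monte Carlo error that shrinks at the rate $1/\sqrt{n}$ in the number of paths.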


5.4 BLACK-SCHOLES FORMULA 

We are now ready to state the pricing formula of a European call option. A corre- 
sponding formula can also be deduced for a European put option. We first establish 
a key fact about lognormal random variables. 

Lemma 5.1 Let $S$ be a lognormally distributed random variable such that
$\log S \sim N(m, v^2)$ and let $K > 0$ be a given constant. Then

$$E(\max\{S - K, 0\}) = E(S)\Phi(d_1) - K\Phi(d_2), \qquad (5.12)$$

where $\Phi(\cdot)$ denotes the distribution function of a standard normal random variable
and

$$d_1 = \frac{1}{v}\left(-\log K + m + v^2\right)
= \frac{1}{v}\left(\log E\left(\frac{S}{K}\right) + \frac{v^2}{2}\right),$$

$$d_2 = \frac{-\log K + m}{v}
= \frac{1}{v}\left(\log E\left(\frac{S}{K}\right) - \frac{v^2}{2}\right).$$

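Proof. Let $g(s)$ denote the p.d.f. (probability density function) of the random
variable $S$. Then

$$E(\max(S - K, 0)) = \int_0^\infty \max(s - K, 0)\,g(s)\,ds = \int_K^\infty (s - K)\,g(s)\,ds.$$

By definition, as $\log S \sim N(m, v^2)$,

$$E(S) = e^{m + \frac{1}{2}v^2} \quad \text{so that} \quad \log E(S) = m + \tfrac{1}{2}v^2.$$

Define the variable $Q$ as

$$Q = \frac{\log S - m}{v} \quad \text{so that} \quad Q \sim N(0, 1).$$

The p.d.f. of $Q$ is given by $\phi(q) = \frac{1}{\sqrt{2\pi}}e^{-q^2/2}$, the p.d.f. of a standard
normal random variable. Since $q = \frac{\log s - m}{v}$, $s = e^{m + qv}$ so that
$dq = \frac{ds}{sv}$. Therefore,

$$\begin{aligned}
E(\max(S - K, 0)) &= \int_K^\infty \max(s - K, 0)\,g(s)\,ds \\
&= \int_{\frac{1}{v}(\log K - m)}^\infty (e^{m + qv} - K)\,g(e^{m + qv})\,sv\,dq \\
&= \int_{\frac{1}{v}(\log K - m)}^\infty (e^{m + qv} - K)\,\phi(q)\,dq \\
&= \int_{\frac{1}{v}(\log K - m)}^\infty e^{m + qv}\phi(q)\,dq
 - K\int_{\frac{1}{v}(\log K - m)}^\infty \phi(q)\,dq \\
&= \mathrm{I} - \mathrm{II}.
\end{aligned} \qquad (5.13)$$

Note that the third equality follows from the fact that the p.d.f. $g$ of a lognormal
random variable $S$ has the form

$$g(s) = \phi\left(\frac{\log s - m}{v}\right)\frac{1}{sv}, \quad \text{so that} \quad
g(e^{m + qv})\,sv = \phi(q). \qquad (5.14)$$

We now analyze each of the terms I and II in Equation 5.13. Consider the first term,

$$\begin{aligned}
\mathrm{I} &= \int_{\frac{1}{v}(\log K - m)}^\infty e^{m + qv}\phi(q)\,dq \\
&= e^{m + \frac{v^2}{2}}\int_{\frac{1}{v}(\log K - m)}^\infty \phi(q - v)\,d(q - v) \\
&= e^{m + \frac{v^2}{2}}\left[1 - \Phi\left(\frac{\log K - m}{v} - v\right)\right] \\
&= e^{m + \frac{v^2}{2}}\Phi\left(\frac{-\log K + m}{v} + v\right).
\end{aligned}$$

For the second term, we have

$$\mathrm{II} = K\int_{\frac{1}{v}(\log K - m)}^\infty \phi(q)\,dq
= K\Phi\left(\frac{-\log K + m}{v}\right).$$

Substituting these two expressions into Equation 5.13, we have

$$E(\max(S - K, 0)) = e^{m + \frac{v^2}{2}}\Phi\left(\frac{-\log K + m}{v} + v\right)
- K\Phi\left(\frac{-\log K + m}{v}\right).$$

Observe that because $\log E(S/K) = -\log K + m + \frac{v^2}{2}$,

$$\frac{-\log K + m}{v} + v = \frac{-\log K + m + v^2}{v}
= \frac{1}{v}\left(\log E(S/K) + \frac{v^2}{2}\right) = d_1.$$

Similarly, it can be easily shown that

$$d_2 = \frac{-\log K + m}{v}.$$

This completes the proof of the lemma. □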
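As a sanity check on Lemma 5.1, the closed form in Equation 5.12 can be compared with a direct numerical integration of $(e^{m + qv} - K)\phi(q)$ over $q \geq (\log K - m)/v$. The sketch below is illustrative only: the function names, the midpoint rule, the truncation point `q_max`, and the parameter values are assumptions, and the normal distribution function is computed from the standard library error function.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lemma_closed_form(m, v, K):
    """E(max(S - K, 0)) for log S ~ N(m, v^2), using Equation 5.12."""
    d1 = (-math.log(K) + m + v * v) / v
    d2 = (-math.log(K) + m) / v
    return math.exp(m + 0.5 * v * v) * norm_cdf(d1) - K * norm_cdf(d2)

def lemma_by_quadrature(m, v, K, n=200_000, q_max=12.0):
    """Midpoint-rule integration of (e^{m+qv} - K) phi(q) on [lower, q_max]."""
    lower = (math.log(K) - m) / v
    h = (q_max - lower) / n
    total = 0.0
    for i in range(n):
        q = lower + (i + 0.5) * h
        phi = math.exp(-0.5 * q * q) / math.sqrt(2.0 * math.pi)
        total += (math.exp(m + q * v) - K) * phi
    return total * h

# Illustrative values: log S_T for S0 = 62, r = 0.1, sigma = 0.2, T = 5/12, K = 60.
T = 5 / 12
m = math.log(62.0) + (0.1 - 0.5 * 0.2 ** 2) * T
v = 0.2 * math.sqrt(T)
closed = lemma_closed_form(m, v, 60.0)
numeric = lemma_by_quadrature(m, v, 60.0)
print(closed, numeric)
```

The two numbers should agree to many decimal places, which is a quick way to catch sign or index slips when the lemma is coded for later use.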


Using this lemma, we are now ready to state the Black-Scholes pricing formula. 

Theorem 5.3 Consider a European call option with strike price $K$ and expiration
time $T$. If the underlying stock pays no dividends during the time $[0, T]$ and if there
is a continuously compounded risk-free rate $r$, then the price of this contract at time
0, $f(S, 0) = C(S, 0)$, is given by

$$C(S, 0) = S\Phi(d_1) - Ke^{-rT}\Phi(d_2), \qquad (5.15)$$

where $\Phi(x)$ denotes the cumulative distribution function of a standard normal random
variable evaluated at the point $x$,

$$d_1 = \frac{\log(S/K) + (r + \sigma^2/2)T}{\sigma\sqrt{T}},$$

$$d_2 = \frac{\log(S/K) + (r - \sigma^2/2)T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T}.$$


Proof. The proof of this result relies on the risk-neutral valuation. By Theorem 5.2,
we have

$$C(S) = e^{-rT}\hat{E}(\max\{S_T - K, 0\}), \qquad (5.16)$$

where $S_T$ denotes the stock price at time $T$, $\hat{E}$ denotes the risk-neutral expectation,
and

$$dS = rS\,dt + \sigma S\,dW. \qquad (5.17)$$

In this case, we have

$$\hat{E}S_T = S_0e^{rT}. \qquad (5.18)$$

From the preceding lemma, we get

$$\hat{E}(\max\{S_T - K, 0\}) = \hat{E}(S_T)\Phi(d_1) - K\Phi(d_2).$$

The remaining job is to identify $d_1$, $d_2$, and $\hat{E}(S_T)$. By construction,
$\hat{E}S_T = S_0e^{rT}$. Recalling from Equation 5.17, we can easily deduce from Ito’s lemma that

$$d\log S_t = \nu\,dt + \sigma\,dW_t, \quad \text{with } \nu = r - \tfrac{1}{2}\sigma^2.$$

Consequently,

$$m = \hat{E}(\log S_T) = \log S_0 + rT - \tfrac{1}{2}\sigma^2T,$$

$$v^2 = \mathrm{Var}(\log S_T) = \sigma^2T.$$

According to the lemma,

$$\begin{aligned}
d_1 &= \frac{-\log K + m + v^2}{v} \\
&= \frac{1}{\sigma\sqrt{T}}\left[-\log K + \log S_0 + \left(r - \tfrac{1}{2}\sigma^2\right)T + \sigma^2T\right] \\
&= \frac{1}{\sigma\sqrt{T}}\left[\log\left(\frac{S_0}{K}\right) + \left(r + \tfrac{1}{2}\sigma^2\right)T\right].
\end{aligned}$$

By similar substitutions, it can be easily shown that

$$d_2 = \frac{\log(S_0/K) + (r - \sigma^2/2)T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T}. \qquad (5.19)$$

Substituting these quantities into Equation 5.16 gives
$C = e^{-rT}\left[S_0e^{rT}\Phi(d_1) - K\Phi(d_2)\right] = S_0\Phi(d_1) - Ke^{-rT}\Phi(d_2)$.
This completes the proof of the Black-Scholes formula (Eq. 5.15).


Example 5.4 Consider a 5-month European call option on an underlying stock with
a current price of $62, strike price $60, annual risk-free rate 10%, and the volatility of
this stock is 20% per year. In this case, $S = 62$, $K = 60$, $r = 0.1$, $\sigma = 0.2$, and
$T = \frac{5}{12}$. Applying Equation 5.15, we get

$$d_1 = \frac{\log\left(\frac{62}{60}\right) + \left(0.1 + \frac{0.2^2}{2}\right)\frac{5}{12}}{0.2\sqrt{5/12}} = 0.641287,$$

$$d_2 = d_1 - 0.2\sqrt{5/12} = 0.512188.$$

From the normal table, we get $\Phi(d_1) = 0.739332$ and $\Phi(d_2) = 0.695740$.
Consequently,

$$C = (62)(0.739332) - (60)e^{-(0.1)(5/12)}(0.695740) = 5.798.$$
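The arithmetic of Example 5.4 is easy to reproduce in code. The sketch below is an illustration in Python rather than the VBA code that accompanies the book; the function names `norm_cdf` and `bs_call` are assumptions, and the normal distribution function is computed from the standard library error function instead of a normal table.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call on a non-dividend-paying stock.

    Returns (price, d1, d2) so the intermediate quantities can be inspected.
    """
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    price = S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    return price, d1, d2

# Parameters of Example 5.4.
price, d1, d2 = bs_call(62.0, 60.0, 0.1, 0.2, 5 / 12)
print(price, d1, d2)
```

The output should match the values in the example up to rounding: $d_1$ about 0.6413, $d_2$ about 0.5122, and a call price of about 5.80.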
Remarks

1. Note that the Black-Scholes pricing formula is derived using a risk-neutral 
valuation argument in this case. Alternatively, for a given derivative such as a 
European call option, we can try to solve the PDE given by the Black-Scholes 
Equation 5.3 subject to the explicit boundary conditions given in Remark 3 in 
Section 5.3. This was the original idea of Black and Scholes, and it is commonly 
known as the PDE approach. Although feasible, due to the complexity of the 
PDE of the Black-Scholes equation, the risk-neutral valuation argument offers 
a more intuitive approach on the basis of the arbitrage-free argument. 

2. For a European put option, the corresponding pricing formula is given by

$$P = Ke^{-rT}\Phi(-d_2) - S_0\Phi(-d_1),$$

where $r$, $K$, $d_1$, and $d_2$ are defined as in Equation 5.15.

3. To interpret the Black-Scholes formula, look at what happens to $d_1$ and $d_2$ as
$T \to 0$. If $S_0 > K$, they both tend to $\infty$ so that $\Phi(d_1) = \Phi(d_2) = 1$ and
$\Phi(-d_1) = \Phi(-d_2) = 0$. This means that

$$C = S_0 - K \quad \text{and} \quad P = 0.$$

On the other hand, if $S_0 < K$, the reverse argument shows $d_1$ and $d_2$ tend to $-\infty$
as $T \to 0$ so that

$$C = 0 \quad \text{and} \quad P = K - S_0.$$

Is this reasonable? When $S_0 > K$, and when $T = 0$, the call option should be
worth $S_0 - K$ and the put option is of course worthless. On the other hand, if
$S_0 < K$ and $T = 0$, the put option should be worth $K - S_0$ and the call option
becomes worthless. Thus, the Black-Scholes formula offers the price that is
consistent with the boundary condition.

4. What happens when $T \to \infty$? In this case, $d_1 \to \infty$ and $Ke^{-rT} \to 0$, so that
$C = S_0$ and $P = 0$. This is known as the perpetual call. If we own the call for a long
time, the stock value will almost certainly increase to a very large value so that the strike
price $K$ is irrelevant. Hence, if we own the call we could obtain the stock later
for essentially nothing, duplicating the position we would have if we initially
bought the stock. Thus, $C = S_0$.

5. The Black-Scholes formula is derived for a European call option under the 
situation where the stock pays no dividends. When the underlying stock does 
pay dividends at a specific time during the life of the option, a similar formula 
to price the option can also be deduced. Again, we refer the interested readers 
to Hull (2006) for further details. 

6. For an American option where early exercise is allowed, one can no longer find 
an exact analytic formula such as Equation 5.15 for the price of a call. Instead, 
a range of possible values can be deduced, and details are given in Hull (2006). 

7. In using the Black-Scholes formula, one important quantity required is the
value of $\sigma$, the volatility or the risk of the underlying stock. To use the formula,
we can estimate $\sigma$ from the historical data and put this estimate into the
Black-Scholes equation. Such an approach is known as the historical volatility
approach. On the other hand, one can also use the Black-Scholes formula to
imply the value of $\sigma$, known as the implied volatility. In this latter approach,
we substitute the observed price of the derivative as the real price into the
Black-Scholes formula to solve for $\sigma$, giving it the name of implied volatility.
This quantity can be used to monitor the market’s opinion about the volatility
of a particular stock. Analysts often calculate implied volatilities from actively
traded options on a certain stock and use them to calculate the price of a less
actively traded option on the same stock.
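
The implied volatility calculation described in Remark 7 can be sketched numerically. The
snippet below is only an illustration in Python rather than the book's VBA: it assumes the
standard Black-Scholes call formula for a non-dividend-paying stock, and the names
`bs_call` and `implied_vol`, the bracketing interval, and the tolerance are all
hypothetical choices. Bisection is used because the call price is increasing in $\sigma$.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def bs_call(s0, k, r, t, sigma):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * PHI(d1) - k * math.exp(-r * t) * PHI(d2)


def implied_vol(price, s0, k, r, t, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve bs_call(sigma) = price for sigma by bisection on [lo, hi]."""
    if not bs_call(s0, k, r, t, lo) <= price <= bs_call(s0, k, r, t, hi):
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s0, k, r, t, mid) < price:
            lo = mid  # model price too low, so sigma must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    observed = bs_call(100.0, 100.0, 0.05, 1.0, 0.2)  # stand-in for a market quote
    print(implied_vol(observed, 100.0, 100.0, 0.05, 1.0))
```

In practice the observed price would come from the market; here it is generated from the
formula itself so that the recovered volatility can be checked against the input.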

5.5 EXERCISES 

1. A company’s share price is now $60. Six months from now, it will be either $75 
with risk-neutral probability 0.7 or $50 with risk-neutral probability 0.3. A call 
option exists on the stock that can be exercised only at the end of 6 months with 
exercise price of $65. 

(a) If you wish to establish a perfectly hedged position, what would you do? 

(b) Under each of the two possibilities, what will be the value of your hedged 
position? 

(c) What is the expected value of option price at the end of the period? 

(d) What is the reasonable option price today? 

2. Consider the binomial model of Section 5.2. 

(a) Show that the European call option price of the two-period model is given by

$$c_2 = \left[ p^2 c_{uu} + 2p(1-p)\,c_{ud} + (1-p)^2 c_{dd} \right] e^{-rT},$$

where $T$ is the option maturity and

$$c_{uu} = \max(S u^2 - K, 0), \quad c_{ud} = \max(S u d - K, 0), \quad c_{dd} = \max(S d^2 - K, 0).$$

(b) Show by induction that the $n$-period call price is given by

$$c_n = e^{-rT} \sum_{j=0}^{n} \left\{ {}_nC_j\, q^j (1-q)^{n-j} \max(S u^j d^{n-j} - K, 0) \right\}.$$

(c) Cox, Ross, and Rubinstein (CRR, 1979) propose that $u = e^{\sigma\sqrt{\Delta t}}$ and
$d = e^{-\sigma\sqrt{\Delta t}}$, where $\sigma$ is the annualized asset volatility, are respectively
appropriate choices for the upward and downward factors in implementing
the binomial model. Show that

$$\lim_{n \to \infty} c_n = S\,\Phi(d_1) - K e^{-rT}\,\Phi(d_2),$$

the Black-Scholes call price, if the CRR proposal is adopted.

3. By Theorem 5.2, show the put-call parity relation

$$p + S = c + K e^{-rT}.$$


4. A fixed strike geometric Asian call option has the payoff function $\max(G_T - K, 0)$,
where

$$G_T = \exp\left( \frac{1}{T} \int_0^T \ln S_t \, dt \right).$$

By Theorem 5.2 and Lemma 5.1, determine the analytical solution for the fixed
strike geometric Asian call option. (Hints: 1. Apply the result of question 7(b) of
Chapter 4. 2. You can find the answer in Chapter 9.)

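5. Consider the PDE:

$$\frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2(t,x)\frac{\partial^2 f}{\partial x^2} + \mu(t,x)\frac{\partial f}{\partial x} + a(t,x) f = 0, \qquad f(T,x) = F(x).$$

By modifying the proof of Theorem 5.2, show that

$$f(t,x) = E\left[ e^{\int_t^T a(\tau, X_\tau)\,d\tau}\, F(X_T) \right],$$

where $X_\tau$ is the solution to the SDE:

$$dX_\tau = \mu(\tau, X_\tau)\,d\tau + \sigma(\tau, X_\tau)\,dW_\tau, \qquad X_t = x.$$

This result is called the Feynman-Kac formula.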

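6. Suppose the risk-free interest rate and the volatility of an asset are deterministic
functions of time. That means,

$$r = r(t) \quad \text{and} \quad \sigma = \sigma(t).$$

(a) Show that the Black-Scholes equation governing European option prices,
$f(t, S)$, is given by

$$\frac{\partial f}{\partial t} + \frac{1}{2}\sigma(t)^2 S^2 \frac{\partial^2 f}{\partial S^2} + r(t)\, S \frac{\partial f}{\partial S} - r(t) f = 0.$$

(b) Show that the European call option price satisfies:

$$f(t, S) = e^{-\int_t^T r(\tau)\,d\tau}\, E[\max(S_T - K, 0)],$$

where

$$dS_\tau = r(\tau) S_\tau\,d\tau + \sigma(\tau) S_\tau\,dW_\tau, \quad \tau > t, \quad \text{and} \quad S_t = S.$$

Hint: Use the result of question 5.

(c) Hence, show that

$$f(t, S) = C_{BS}(t, S;\ r = \bar{r},\ \sigma = \bar{\sigma}),$$

where $C_{BS}$ is the Black-Scholes formula for a call option with constant parameters,

$$\bar{r} = \frac{1}{T-t}\int_t^T r(\tau)\,d\tau \quad \text{and} \quad \bar{\sigma}^2 = \frac{1}{T-t}\int_t^T \sigma^2(\tau)\,d\tau.$$
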


7. A stochastic process $X(t)$ is said to be a martingale under a probability measure $\mathcal{P}$
if $E^{\mathcal{P}}[X(t) \mid X(s), s \le \tau] = X(\tau)$ for all $\tau \le t$, with probability one.

(a) Consider the asset price dynamics under the risk-neutral measure:

$$dS = rS\,dt + \sigma S\,dW.$$

Show that $X(t) = S(t)\,e^{-rt}$ is a martingale.

(b) Denote $C(t, S, T)$ as the Black-Scholes formula for a European call option
with maturity $T$. Show that $C e^{r(T-t)}$ is a martingale.

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.

6.1 INTRODUCTION 

The first stage of simulation is the generation of random numbers. Random 
numbers serve as the building block of simulation. The second stage of simu- 
lation is the generation of random variables on the basis of random numbers. 
This includes generating both discrete and continuous random variables of 
known distributions. In this chapter, we study techniques for generating random 
variables. 


6.2 RANDOM NUMBERS 

Random numbers can be generated in a number of ways. For example, they were gen- 
erated manually or mechanically by spinning wheels or rolling dice in the old days. 
Of course, the notion of randomness may be a subjective judgment. Things that look 
apparently random may not be random according to the strict definition. The mod- 
ern approach is to use a computer to generate pseudo-random numbers successively. 
These pseudo-random numbers, although deterministically generated, constitute a 
sequence of values having the appearance of uniformly (0, 1) distributed random vari- 
ables. 

One of the most popular devices to generate uniform random numbers is the
congruential generator. Starting with an initial value $x_0$, called the seed, the computer
successively calculates the values $x_n$, $n \ge 1$, via

$$x_n = a x_{n-1} + c \ \text{modulo}\ m, \qquad (6.1)$$

where $a$, $c$, and $m$ are given positive integers, and the equality means that the value
$a x_{n-1} + c$ is divided by $m$ and the remainder is taken as the value of $x_n$. Each $x_n$ is
either $0, 1, \ldots, m-1$, and the quantity $x_n/m$ is taken as an approximation to the values
of a uniform (0, 1) random variable. As each of the numbers $x_n$ assumes one of the
values of $0, 1, \ldots, m-1$, it follows that after some finite number of generated values,
a value must repeat itself. For example, if we take $a = c = 1$ and $m = 16$, then

$$x_n = x_{n-1} + 1 \ \text{modulo}\ 16.$$

With $x_0 = 1$, the range of $x_n$ is the set

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, ...}.

When $a = 5$, $c = 1$, and $m = 16$, the range of $x_n$ becomes

{0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, ...}.

We usually want to choose $a$ and $m$ such that for any given seed $x_0$, the number of
variables that can be generated before repetition occurs is large. In practice, one may
choose $m = 2^{31} - 1$ and $a = 7^5$, where the number 31 corresponds to the bit size of
the machine.
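
As a rough illustration of Equation 6.1, the following Python sketch (the book's own codes
are in VBA) implements the recursion as a generator. The function names `lcg` and
`uniforms` are hypothetical, and taking $c = 0$ together with $m = 2^{31} - 1$ and
$a = 7^5$ is an assumption made here; the text only specifies $a$ and $m$.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield x_1, x_2, ... from the recursion x_n = (a * x_{n-1} + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def uniforms(seed, count, a=7 ** 5, c=0, m=2 ** 31 - 1):
    """Return `count` values x_n / m as approximate uniform (0, 1) numbers.

    With c = 0 (an assumption) the seed must not be a multiple of m.
    """
    return [x / m for x in islice(lcg(seed, a, c, m), count)]


if __name__ == "__main__":
    # The small example from the text: a = 5, c = 1, m = 16, started from 0.
    print(list(islice(lcg(0, 5, 1, 16), 16)))
    print(uniforms(12345, 5))
```

The first printed list runs through all 16 residues before returning to 0, which is the
full-period behaviour shown in the second set above.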

Any set of pseudo-random numbers will by definition fail on some problems. It is 
therefore desirable to have a second generator available for comparison. In this case, 
it may be useful to compare results for a fundamentally different generator. 

From now on, we will assume that we can generate a sequence of random numbers 
that can be taken as an approximation to the values of a sequence of independent 
uniform (0, 1) random variables. We do not explore the technical details about the 
construction of good generators; interested readers may consult L’Ecuyer (1994) for 
a survey of random number generators. 


6.3 DISCRETE RANDOM VARIABLES 

A discrete random variable $X$ is specified by its probability mass function given by

$$P(X = x_j) = p_j, \quad j = 0, 1, \ldots, \qquad \sum_j p_j = 1. \qquad (6.2)$$

To generate $X$, generate a random number $U$, which is uniformly distributed in (0, 1),
and set

$$X = \begin{cases} x_0 & \text{if } U < p_0, \\ x_1 & \text{if } p_0 \le U < p_0 + p_1, \\ \ \vdots & \\ x_j & \text{if } \sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i, \\ \ \vdots & \end{cases}$$

Recall that for $0 < a < b < 1$, $P(a \le U < b) = b - a$. Thus,

$$P(X = x_j) = P\left( \sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i \right) = p_j, \qquad (6.3)$$

so that $X$ has the desired distribution. Note that if the $x_i$ are ordered so that
$x_0 < x_1 < \cdots$ and if $F$ denotes the distribution function of $X$, then
$F(x_k) = \sum_{j=0}^{k} p_j$ and so

$$X \text{ equals } x_j \text{ if } x_{j-1} < F^{-1}(U) \le x_j.$$

That is, after generating $U$, we determine the value of $X$ by finding the interval
$[F(x_{j-1}), F(x_j))$ in which $U$ lies. This amounts to evaluating the inverse of $F$
at $U$, hence the name inverse transform.
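
The interval search just described can be sketched in a few lines of Python (used here
instead of the book's VBA). The function name `discrete_inverse_transform` and its
optional argument `u`, which allows a fixed uniform number to be supplied for checking,
are hypothetical.

```python
import random


def discrete_inverse_transform(values, probs, u=None):
    """Return the x_j whose interval [F(x_{j-1}), F(x_j)) contains u."""
    if u is None:
        u = random.random()
    cumulative = 0.0
    for x, p in zip(values, probs):
        cumulative += p          # cumulative is now F(x_j)
        if u < cumulative:
            return x
    return values[-1]            # guard against rounding when u is close to 1


if __name__ == "__main__":
    values, probs = [0, 1, 2], [0.2, 0.5, 0.3]
    print([discrete_inverse_transform(values, probs) for _ in range(10)])
```

The search is linear in the number of support points; ordering the values by decreasing
probability would shorten it on average, at the cost of no longer matching the ordering
$x_0 < x_1 < \cdots$ used above.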

Example 6.1 Suppose that we want to generate a binomial random variable $X$ with
parameters $n$ and $p$.

The probability mass function of $X$ is given by

$$p_i = P(X = i) = \frac{n!}{i!\,(n-i)!}\, p^i (1-p)^{n-i}, \quad i = 0, 1, \ldots, n.$$

From this probability mass function, we see that

$$p_{i+1} = \frac{n-i}{i+1}\,\frac{p}{1-p}\, p_i.$$

The algorithm goes as follows:

1. Generate $U$.

2. If $U < p_0$, set $X = 0$ and stop.

3. If $p_0 \le U < p_0 + p_1$, set $X = 1$ and stop.

4. If $p_0 + \cdots + p_{n-1} \le U < p_0 + \cdots + p_n$, set $X = n$ and stop.

Recursively, by letting $i$ be the current value of $X$, $pr = p_i = P(X = i)$, and
$F = F(i) = P(X \le i)$, the probability that $X$ is less than or equal to $i$, the
aforementioned algorithm can be succinctly written as:

Step 1: Generate $U$.

Step 2: $c = p/(1-p)$, $i = 0$, $pr = (1-p)^n$, $F = pr$.

Step 3: If $U < F$, set $X = i$ and stop.

Step 4: $pr = [c(n-i)/(i+1)]\,pr$, $F = F + pr$, $i = i + 1$.

Step 5: Go to Step 3.

To generate binomial random variables X with parameters n = 10 and p = 0.7 in 
Visual Basic for Applications (VBA), go to the Online Supplementary and download 
the file Chapter 6 Generate Binomial Random Variables Bin(10,7). 
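
For readers without the VBA file at hand, Steps 1 to 5 of Example 6.1 can be sketched in
Python as follows. This is not the book's code: the function name
`binomial_inverse_transform` and the optional argument `u` are hypothetical, and the
sketch assumes $0 < p < 1$.

```python
import random


def binomial_inverse_transform(n, p, u=None):
    """Generate a Bin(n, p) value by the recursive inverse transform (0 < p < 1)."""
    if u is None:
        u = random.random()          # Step 1
    c = p / (1.0 - p)                # Step 2
    i = 0
    pr = (1.0 - p) ** n              # pr = P(X = 0)
    f = pr                           # f  = P(X <= 0)
    while u >= f and i < n:          # Step 3 (i < n guards against rounding)
        pr *= c * (n - i) / (i + 1)  # Step 4: P(X = i + 1) from P(X = i)
        f += pr
        i += 1
    return i


if __name__ == "__main__":
    random.seed(2015)
    draws = [binomial_inverse_transform(10, 0.7) for _ in range(10000)]
    print(sum(draws) / len(draws))   # should be near n * p = 7
```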


6.4 ACCEPTANCE-REJECTION METHOD 

In the preceding example, we see how the inverse transform can be used to generate
a known discrete distribution. For most of the standard distributions, we can simulate
their values easily by means of built-in routines available in standard packages.
However, when we move away from standard distributions, simulating values
becomes more involved. One of the most useful methods is the acceptance-rejection
algorithm.

Suppose that we have an efficient method, for example, a computer package, to
simulate a random variable $Y$ having probability mass function $\{q_j, j \ge 0\}$. We can use
this as a basis for simulating a random variable $X$ having probability mass function
$\{p_j, j \ge 0\}$ by first simulating $Y$ and then accepting this simulated value with a
probability proportional to $p_Y/q_Y$. Specifically, let $c$ be a constant such that

$$\frac{p_j}{q_j} \le c \quad \text{for all } j \text{ such that } p_j > 0.$$

Then we can simulate the values of $X$ having probability mass function $p_j = P(X = j)$
as follows:

Step 1: Simulate the value of $Y$ from $q_j$.

Step 2: Generate a uniform random number $U$.

Step 3: If $U < \frac{p_Y}{c\,q_Y}$, set $X = Y$ and stop. Otherwise, go to Step 1.

Theorem 6.1 The acceptance-rejection algorithm generates a random variable X 
such that 

$$P(X = j) = p_j, \quad j = 0, 1, \ldots.$$

In addition, the number of iterations of the algorithm needed to obtain X is a geomet- 
ric random variable with mean c. 

Proof. First consider the probability that a single iteration produces the accepted
value $j$. Note that

$$\begin{aligned} P(Y = j, \text{ it is accepted}) &= P(Y = j)\,P(\text{accepted} \mid Y = j) \\ &= q_j\, P(U < p_j/(c q_j)) \\ &= q_j\, p_j/(c q_j) \\ &= p_j/c. \end{aligned}$$

Summing over $j$, we get the probability that a generated random variable is
accepted as

$$P(\text{accepted}) = \sum_j p_j/c = 1/c.$$

As each iteration independently results in an accepted value with probability $1/c$, the
number of iterations needed is geometric with mean $c$. Finally,

$$P(X = j) = \sum_n P(j \text{ accepted on iteration } n) = \sum_n (1 - 1/c)^{n-1}\, p_j/c = p_j. \qquad \Box$$

Example 6.2 Suppose that we want to simulate a random variable $X$ taking values
in {1, 2, ..., 10} with probabilities as follows:

i           1     2     3     4     5     6     7     8     9     10
P(X = i)    0.11  0.12  0.09  0.08  0.12  0.1   0.09  0.11  0.07  0.11

Using the acceptance-rejection method, first generate discrete uniform random
variables over the integers {1, ..., 10}. That is, $P(Y = j) = q_j = 1/10$ for $j = 1, \ldots, 10$.
Firstly, compute the number $c$ by setting $c = \max_j p_j/q_j = 1.2$. Now generate a discrete
uniform random variable $Y$ by letting $Y = [10 U_1] + 1$, where $U_1 \sim U(0, 1)$. Then
generate another $U_2 \sim U(0, 1)$ and check whether $U_2 \le p_Y/(c q_Y)$. If this condition is
satisfied, then $X = Y$ is the simulated value. Otherwise, repeat the steps again.

To generate the random variables and see the code in VBA, go to the Online
Supplementary and download the file Chapter 6 Example 6.2 Generate a RV with Support
{1, 2, ..., 10}.
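
The steps of Example 6.2 might look as follows in Python; this is an illustrative sketch,
not the VBA file mentioned above. The names `P`, `Q`, `C`, and `sample_x` are
hypothetical, and the function also returns the number of iterations used so that the
geometric mean $c = 1.2$ of Theorem 6.1 can be checked empirically.

```python
import random

P = [0.11, 0.12, 0.09, 0.08, 0.12, 0.10, 0.09, 0.11, 0.07, 0.11]  # P(X = 1), ..., P(X = 10)
Q = 1.0 / len(P)               # proposal: discrete uniform on {1, ..., 10}
C = max(p / Q for p in P)      # smallest constant with p_j / q_j <= c, here 1.2


def sample_x(rng=random):
    """Return (x, iterations) from the acceptance-rejection algorithm."""
    iterations = 0
    while True:
        iterations += 1
        y = int(10 * rng.random()) + 1           # Y = [10 U_1] + 1
        if rng.random() <= P[y - 1] / (C * Q):   # accept if U_2 <= p_Y / (c q_Y)
            return y, iterations


if __name__ == "__main__":
    random.seed(2015)
    results = [sample_x() for _ in range(10000)]
    print(sum(it for _, it in results) / len(results))  # should be near c = 1.2
```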


6.5 CONTINUOUS RANDOM VARIABLES 

Generating continuous random variables is very similar to generating discrete random 
variables. It again relies on two main approaches using uniform random numbers: the 
inverse transform and the acceptance-rejection method. 

6.5.1 Inverse Transform

Theorem 6.2 Let $U$ be a uniform (0, 1) random variable. For any continuous
distribution function $F$, the random variable $X$ defined by $X = F^{-1}(U)$ has distribution
$F$. In this case,

$$F^{-1}(u) = \inf\{x : F(x) \ge u\}.$$

Proof. Let $F_X$ denote the distribution of $X = F^{-1}(U)$. Then

$$\begin{aligned} F_X(x) &= P(X \le x) \\ &= P(F^{-1}(U) \le x) \\ &= P(U \le F(x)) \\ &= F(x). \qquad \Box \end{aligned}$$


Example 6.3 Let $X$ be an exponential random variable with rate 1. Then its distribution
function is given by $F(x) = 1 - e^{-x}$. Let $x = F^{-1}(u)$, then $u = F(x) = 1 - e^{-x}$, so that
$x = -\log(1 - u)$. Thus, we can generate $X$ by generating $U$ and setting $X = -\log(1 - U)$.
Moreover, because $(1 - U)$ has the same distribution as $U$, which is uniform (0, 1),
we can simply set $X = -\log U$. Finally, it can be seen easily that if $Y \sim \exp(\lambda)$, then
$E(Y) = 1/\lambda$ and $Y = X/\lambda$, where $X \sim \exp(1)$. In this case, we can simulate $Y$ by first
simulating $U$ and setting $Y = -\frac{1}{\lambda}\log U$.
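
A minimal Python sketch of Example 6.3 follows (Python is used here for illustration; the
book's codes are in VBA). The function name `exponential` and the optional argument `u`
are hypothetical. The uniform number is taken as `1.0 - random.random()`, which lies in
(0, 1], so that the logarithm is never evaluated at zero.

```python
import math
import random


def exponential(rate, u=None):
    """Generate Y ~ exp(rate) by the inverse transform Y = -log(U) / rate."""
    if u is None:
        u = 1.0 - random.random()   # uniform on (0, 1], avoids log(0)
    return -math.log(u) / rate


if __name__ == "__main__":
    random.seed(2015)
    draws = [exponential(4.0) for _ in range(10000)]
    print(sum(draws) / len(draws))  # should be near 1 / rate = 0.25
```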

The previous example illustrates how to apply the inverse transform method when 
the inverse of F can be written down easily. The following example demonstrates the 
case when the inverse of F is not readily available. 

Example 6.4 Let $X \sim \Gamma(n, \lambda)$. Then it has distribution function

$$F_X(x) = \int_0^x \frac{\lambda e^{-\lambda y} (\lambda y)^{n-1}}{(n-1)!}\,dy.$$

Clearly, finding the inverse of $F_X$ is not feasible. But recall that $X = \sum_{j=1}^{n} Y_j$, where
$Y_j \sim \Gamma(1, \lambda)$ are i.i.d. (independent and identically distributed). Furthermore, each $Y_j$
has distribution function

$$F_{Y_j}(y) = \int_0^y \lambda e^{-\lambda s}\,ds,$$

which is the distribution function of an exponential distribution with rate $\lambda$. Therefore,
we can generate $X$ via

$$X = -\frac{1}{\lambda}\log U_1 - \cdots - \frac{1}{\lambda}\log U_n = -\frac{1}{\lambda}\log(U_1 \cdots U_n).$$

To generate a random variable $X$ that follows a gamma distribution with parameters
$n = 5$ and $\lambda = 10$ in VBA, go to the Online Supplementary and download the file
Chapter 6 Generate Gamma Random Variables.
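
The product formula of Example 6.4 translates directly into code. The Python sketch below
is only an illustration and not the VBA file referred to above; the function name
`gamma_integer_shape` is hypothetical, and the approach assumes that the shape parameter
$n$ is a positive integer of moderate size (for very large $n$ the product of uniforms
can underflow, and summing the logarithms would be safer).

```python
import math
import random


def gamma_integer_shape(n, lam, rng=random):
    """Generate X ~ Gamma(n, lam) as -(1/lam) * log(U_1 * ... * U_n)."""
    product = 1.0
    for _ in range(n):
        product *= 1.0 - rng.random()   # uniform on (0, 1], avoids log(0)
    return -math.log(product) / lam


if __name__ == "__main__":
    random.seed(2015)
    draws = [gamma_integer_shape(5, 10.0) for _ in range(10000)]
    print(sum(draws) / len(draws))      # should be near n / lam = 0.5
```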

The message from these two examples is that, although the inverse transform 
method is simple, we may need to conduct certain simplifications before applying 
the method. 


6.5.2 The Rejection Method 

Suppose that we can simulate from a density $g$ easily. We can use this as a basis
to simulate from a density $f(x)$ by first generating $Y$ from $g$ and then accepting the
generated value with probability proportional to $f(Y)/g(Y)$. Specifically, let $c$ be such
that

$$\frac{f(y)}{g(y)} \le c \quad \text{for all } y.$$

Then we generate from $f$ via the following algorithm:

Step 1: Generate $Y$ from a density $g$.

Step 2: Generate a uniform random number $U$.

Step 3: If $U \le \frac{f(Y)}{c\,g(Y)} := h(Y)$, set $X = Y$, else go to Step 1.

This is exactly the same acceptance-rejection method as in the discrete case. Corre- 
spondingly, we have the following result whose proof is almost the same as in the 
discrete case. 

Theorem 6.3 The random variable X generated by the rejection method has density 
f. Moreover, the number of iterations that this algorithm needs is a geometric random 
variable with mean c. 


Proof. Let $f(x) = c\,g(x)\,h(x)$, where $c \ge 1$ is a constant, $g(x)$ is also a p.d.f. (probability
density function) and $0 \le h(x) \le 1$. Let $Y$ have p.d.f. $g$ and $U \sim U(0, 1)$. Consider

$$f_Y(x \mid U \le h(Y)) = \frac{P(U \le h(Y) \mid Y = x)\, g(x)}{P(U \le h(Y))}.$$

For the first part in the numerator, we have

$$P(U \le h(Y) \mid Y = x) = P(U \le h(x)) = h(x).$$

For the denominator, consider

$$\begin{aligned} P(U \le h(Y)) &= \int P(U \le h(Y) \mid Y = x)\, g(x)\,dx \\ &= \int h(x)\, g(x)\,dx \\ &= \int \frac{1}{c} f(x)\,dx \\ &= \frac{1}{c}. \end{aligned}$$

Therefore, $f_Y(x \mid U \le h(Y)) = h(x)\, g(x)\, c = f(x)$. □


One of the difficulties in using the rejection method is determining the constant $c$.
Our goal is to find the function $c\,g(x)$ so that $c\,g(x) \ge f(x)$ and we can sample easily
from the density $g(x)$. This can be achieved by trial and error or, in certain
circumstances, by simple analysis, as illustrated in the following example.

Example 6.5 Suppose that we want to simulate from the density

$$f(x) = 20x(1-x)^3, \quad 0 < x < 1.$$

First note that $f$ is defined only on the interval (0, 1). We may try $g$ that can be
simulated easily over the same interval, uniform (0, 1), say, that is, $g(x) = 1$, $0 < x < 1$.
To determine the smallest number $c$ such that $f(x)/g(x) \le c$ for all $0 < x < 1$, we
first find the maximum value of the ratio $f(x)/g(x) = 20x(1-x)^3$. Using calculus,
differentiating, and setting to zero,

$$\frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) = 0,$$

we solve $x = 1/4$ to be the maximum of $f/g$. Thus,

$$\frac{f(x)}{g(x)} \le 20(1/4)(3/4)^3 = 135/64 = c.$$

Therefore,

$$\frac{f(x)}{c\,g(x)} = \frac{20(64)}{135}\, x(1-x)^3 = \frac{256}{27}\, x(1-x)^3.$$

The algorithm becomes:

Step 1: Generate random numbers $U_1$ and $U_2$.

Step 2: If $U_2 \le \frac{256}{27}\, U_1 (1 - U_1)^3$, stop and set $X = U_1$. Else go to Step 1.

To simulate from this distribution in VBA, go to the Online Supplementary and
download the file Chapter 6 Example 6.5 Generate Random Variables using Acceptance
Rejection Method.
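
The two steps of Example 6.5 can be sketched in Python as below; this is an illustration
and not the VBA file mentioned above. The names `f`, `C`, and `sample` are hypothetical.
The density $20x(1-x)^3$ is that of a Beta(2, 4) distribution, whose mean is $1/3$, which
gives a simple check on the output; the average number of iterations should be close to
$c = 135/64 \approx 2.11$, in line with Theorem 6.3.

```python
import random

C = 135.0 / 64.0   # maximum of f(x) / g(x), attained at x = 1/4


def f(x):
    """Target density on (0, 1)."""
    return 20.0 * x * (1.0 - x) ** 3


def sample(rng=random):
    """Return (x, iterations) using a uniform (0, 1) proposal, g(x) = 1."""
    iterations = 0
    while True:
        iterations += 1
        u1 = rng.random()          # Step 1: candidate Y = U_1 from g
        u2 = rng.random()
        if u2 <= f(u1) / C:        # Step 2: accept if U_2 <= f(U_1) / (c g(U_1))
            return u1, iterations


if __name__ == "__main__":
    random.seed(2015)
    results = [sample() for _ in range(10000)]
    print(sum(x for x, _ in results) / len(results))     # should be near 1/3
    print(sum(it for _, it in results) / len(results))   # should be near 135/64
```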


6.5.3 Multivariate Normal 

An important application of simulation is to handle high dimensional problems. High
dimensional problems are usually related to multivariate normal distributions
(Gaussian distribution). However, most software packages do not provide algorithms for
generating multivariate normal random variables. This section studies algorithms for
generating multivariate normal random variables.

A random vector $X$ is said to follow a multivariate normal distribution if every linear
combination of its elements is a normal random variable. The distribution of $X$ is then
described as

$$N(m, \Sigma), \qquad (6.4)$$

where $m = E[X]$ is the mean vector and $\Sigma = \mathrm{Var}[X]$ is the variance-covariance
matrix. Consider a vector $X = (X_1, \ldots, X_n)^T$ with $X_i \sim N(\mu_i, \sigma_i^2)$. In this case,
the mean vector $m = (\mu_1, \ldots, \mu_n)^T$ and the $n \times n$ matrix $\Sigma = [\mathrm{Cov}(X_i, X_j)]$,
$i, j = 1, \ldots, n$.

There is a convenient way to generate a normal random vector $X$ when $\Sigma = I$. $\Sigma = I$
indicates that the elements of $X$ are independent random variables. Therefore, we can
generate $X_i$ independently and then stack them up to form the vector $X$. For a normal
random vector with dependent components, that is, $\Sigma \ne I$, decomposition methods
are useful.

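6.5.3.1 Cholesky Decomposition The first method is the Cholesky decomposition.
Consider two correlated standard normal random variables $X_1$ and $X_2$ with correlation
coefficient $\rho$, written as,

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right).$$

Theorem 6.4 Correlated random variables $X_1$ and $X_2$ can be decomposed into two
uncorrelated random variables $Z_1$ and $Z_2$ through the linear transformation:

$$Z_1 = X_1, \qquad Z_2 = \frac{X_2 - \rho X_1}{\sqrt{1 - \rho^2}}.$$

In other words,

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{pmatrix} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}, \qquad (6.5)$$

where

$$Z = \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right).$$

Proof. As $X_1$ and $X_2$ are linear combinations of normal random variables, they are
also normally distributed. Furthermore,

$$\begin{aligned} E(X_1) &= E(X_2) = 0, \\ \mathrm{Var}(X_1) &= 1, \quad \mathrm{Var}(X_2) = (1 - \rho^2)\,\mathrm{Var}(Z_2) + \rho^2\,\mathrm{Var}(Z_1) = 1, \\ \mathrm{Cov}(X_1, X_2) &= \mathrm{Cov}(Z_1, Z_2\sqrt{1 - \rho^2} + \rho Z_1) = \rho. \end{aligned}$$

Thus, $X_1$ and $X_2$ have the desired distribution. □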

The linear transformation of Equation 6.5 is called the Cholesky decomposition. It
enables us to generate $(X_1, X_2)$ by the following procedure.

Step 1: Generate $Z_1, Z_2 \sim N(0, 1)$ i.i.d.

Step 2: Set $X_1 = Z_1$ and $X_2 = Z_2\sqrt{1 - \rho^2} + \rho Z_1$.

In fact, there is a Cholesky decomposition for $N(m, \Sigma)$. As $\Sigma$ is a positive semidefinite
matrix, that is, $v^T \Sigma v \ge 0$ for all vectors $v$, there exists a lower triangular matrix $L$
such that $\Sigma = LL^T$. The Cholesky decomposition is an algorithm to obtain this lower
triangular matrix $L$.

For $n \times n$ matrices $\Sigma = [\sigma_{ij}]$ and $L = [l_{ij}]$, the Cholesky decomposition algorithm
works as follows.

Step 1: Set $l_{11} = \sqrt{\sigma_{11}}$.

Step 2: For $j = 2, \ldots, n$, set $l_{j1} = \sigma_{j1}/l_{11}$.

Step 3: For $i = 2, \ldots, n-1$, conduct Step 4 and Step 5.

Step 4: Set $l_{ii} = \left[ \sigma_{ii} - \sum_{k=1}^{i-1} l_{ik}^2 \right]^{1/2}$.

Step 5: For $j = i+1, \ldots, n$, set $l_{ji} = \left( \sigma_{ji} - \sum_{k=1}^{i-1} l_{jk} l_{ik} \right) / l_{ii}$.

Step 6: Set $l_{nn} = \left[ \sigma_{nn} - \sum_{k=1}^{n-1} l_{nk}^2 \right]^{1/2}$.

Given the matrix $L$, a random vector $X \sim N(m, \Sigma)$ is generated by

$$X = m + LZ, \quad Z \sim N(0, I). \qquad (6.6)$$


To perform Cholesky decomposition and generate multivariate normal random vari- 
ables in VBA, go to the Online Supplementary and download the file Chapter 6 
Cholesky Decomposition. 

Theorem 6.5 The $X$ obtained in Equation 6.6 follows $N(m, \Sigma)$.

Proof. The random vector $X$ has a Gaussian distribution as it is a linear combination
of Gaussian random variables. Therefore, it suffices to check the mean and variance
of $X$. For the mean,

$$E[X] = m + E[LZ] = m.$$

For the variance,

$$\mathrm{Var}[X] = \mathrm{Var}[LZ] = L(\mathrm{Var}[Z])L^T = LL^T = \Sigma.$$

□
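
The Cholesky steps and Equation 6.6 can be sketched in plain Python as follows. This is
not the book's VBA routine: the names `cholesky` and `mvn_sample` are hypothetical, the
six steps are folded into a single loop over columns, and the sketch assumes that
$\Sigma$ is strictly positive definite so that no diagonal entry of $L$ is zero.

```python
import math
import random


def cholesky(sigma):
    """Return the lower triangular L with L L^T = sigma (sigma positive definite)."""
    n = len(sigma)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: l_ii = sqrt(sigma_ii - sum_k l_ik^2)
        s = sigma[i][i] - sum(low[i][k] ** 2 for k in range(i))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        low[i][i] = math.sqrt(s)
        # Entries below the diagonal in column i
        for j in range(i + 1, n):
            t = sigma[j][i] - sum(low[j][k] * low[i][k] for k in range(i))
            low[j][i] = t / low[i][i]
    return low


def mvn_sample(m, low, rng=random):
    """Return one draw of X = m + L Z with Z ~ N(0, I)."""
    z = [rng.gauss(0.0, 1.0) for _ in m]
    return [m[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(m))]


if __name__ == "__main__":
    sigma = [[1.0, -0.1, 0.2], [-0.1, 1.0, 0.1], [0.2, 0.1, 1.0]]
    low = cholesky(sigma)
    print(mvn_sample([0.0, 0.0, 0.0], low))
```

The factor `low` needs to be computed only once; each additional draw then costs a single
matrix-vector product.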

Example 6.6 Consider a portfolio of three assets: $P(t) = S_1(t) + 2S_2(t) + 3S_3(t)$.
The current asset values are $S_1(0) = 100$, $S_2(0) = 60$, and $S_3(0) = 30$. Suppose that
the rates of return of the three assets follow a multivariate normal distribution.
Specifically, we let

$$R_i(t) = \frac{S_i(t + \Delta t) - S_i(t)}{S_i(t)} \quad \text{and} \quad R(t) = (R_1(t), R_2(t), R_3(t))^T,$$

where

$$R(t) = \begin{pmatrix} 0.4\Delta t + 0.2\sqrt{\Delta t}\,X_1 \\ -0.03\Delta t + 0.4\sqrt{\Delta t}\,X_2 \\ 0.2\Delta t + 0.25\sqrt{\Delta t}\,X_3 \end{pmatrix},$$

$$\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & -0.1 & 0.2 \\ -0.1 & 1 & 0.1 \\ 0.2 & 0.1 & 1 \end{pmatrix} \right).$$

Simulate 10 sample paths of the portfolio with $\Delta t = 1/100$.

To see the simulation of 10 sample paths of the portfolio in VBA, go to the Online
Supplementary and download the file Chapter 6 Example 6.6 Simulating 10 paths of
the portfolio.

Two graphs are produced by the programme. Figure 6.1 plots 10 portfolio sample
paths against time. Figure 6.2 plots one sample path for each individual asset and
one sample path of the portfolio. The assets and the portfolio can be identified by
their initial values.


Figure 6.1 Sample paths of the portfolio.

Figure 6.2 Sample paths of the assets and the portfolio.

6.5.3.2 Eigenvalue Decomposition The second method is the eigenvalue
decomposition. Given an $n \times n$ matrix $\Sigma$, if a constant value $\lambda$ and a nonzero vector $v$ satisfy:

$$\Sigma v = \lambda v, \qquad (6.7)$$

then $\lambda$ is called an eigenvalue of the matrix $\Sigma$ and $v$ is the corresponding eigenvector.
In principle, there are $n$ eigenvalues for an $n \times n$ matrix. For the variance-covariance
matrix $\Sigma$, we know that all eigenvalues are non-negative and eigenvectors are
orthogonal because $\Sigma$ is symmetric and positive semidefinite.

In multivariate analysis, eigenvalues of a variance-covariance matrix $\Sigma$ are arranged
in descending order as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and the corresponding eigenvectors are
chosen to have unit length. This means $\|v_i\| = 1$ for $i = 1, 2, \ldots, n$. Under these
specifications, $v_1$ is called the first principal component, $v_2$ is the second principal
component, and so on. More importantly, the matrix $\Sigma$ can be decomposed into a
product of three square matrices:

$$\Sigma = PDP^T, \qquad (6.8)$$

where $P = [v_1, v_2, \ldots, v_n]$ and $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a diagonal matrix. In
S-Plus, eigenvalues and eigenvectors are easily obtained with the subroutine
“eigen()”.

Theorem 6.6 If $Z \sim N(0, I)$, then $X = m + P\sqrt{D}Z \sim N(m, \Sigma)$.

Proof. Again, it suffices to check the mean and variance of $X$. For the mean,

$$E[X] = m + E[P\sqrt{D}Z] = m.$$

For the variance,

$$\mathrm{Var}[X] = \mathrm{Var}[P\sqrt{D}Z] = P\sqrt{D}\,(\mathrm{Var}[Z])\,[P\sqrt{D}]^T = PDP^T = \Sigma.$$

□
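
To make Theorem 6.6 concrete without relying on a matrix library, the Python sketch below
treats only the $2 \times 2$ case, where the eigenvalues and eigenvectors of a symmetric
matrix are available in closed form. The names `eigen_sym_2x2`, `eigen_factor`, and
`mvn_sample_eigen` are hypothetical; for larger matrices a numerical eigenvalue routine
would be needed instead.

```python
import math
import random


def eigen_sym_2x2(a, b, d):
    """Eigenvalues (descending) and unit eigenvectors of [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lams = [mean + radius, mean - radius]
    if b == 0.0:  # already diagonal: eigenvectors are the coordinate axes
        vecs = [[1.0, 0.0], [0.0, 1.0]] if a >= d else [[0.0, 1.0], [1.0, 0.0]]
    else:
        vecs = []
        for lam in lams:
            v = [b, lam - a]  # solves (a - lam) v_1 + b v_2 = 0
            norm = math.hypot(v[0], v[1])
            vecs.append([v[0] / norm, v[1] / norm])
    return lams, vecs


def eigen_factor(sigma):
    """Return A = P sqrt(D), so that A A^T = sigma."""
    lams, vecs = eigen_sym_2x2(sigma[0][0], sigma[0][1], sigma[1][1])
    roots = [math.sqrt(max(lam, 0.0)) for lam in lams]  # clip rounding noise
    return [[vecs[j][i] * roots[j] for j in range(2)] for i in range(2)]


def mvn_sample_eigen(m, factor, rng=random):
    """Return one draw of X = m + P sqrt(D) Z with Z ~ N(0, I)."""
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    return [m[i] + factor[i][0] * z[0] + factor[i][1] * z[1] for i in range(2)]


if __name__ == "__main__":
    factor = eigen_factor([[1.0, 0.5], [0.5, 1.0]])
    print(mvn_sample_eigen([0.0, 0.0], factor))
```

Unlike the Cholesky factor, $P\sqrt{D}$ is not triangular, but it remains usable when
some eigenvalues are zero, that is, when $\Sigma$ is only positive semidefinite.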

Remarks VBA users may worry about matrix operations used in the aforemen- 
tioned algorithms. Fortunately, there are free downloads available on the Web 
that provide necessary subroutines under the platform of Excel. For instance, the 
PoPTools, from http://www.cse.csiro.au/poptools/index.htm, includes routines of 
Cholesky and eigenvalue decompositions. 


6.6 EXERCISES 

1. Use the inverse transform method to generate a random variable $X$ with the
probability mass function

(b) $P(X = j) = {}_{n+j-1}C_j\,(1 - p)^j p^n$, $j = 0, 1, 2, \ldots$, where $n$ and $p$ are given
parameters.

2. We simulate X, Y, Z from an inverse transform algorithm. Suppose that 
U ~ U(0, 1). Determine the distributions of the following random variables: 

(a) $X = \mathrm{int}(10U(1 - U))$.

(b) Y = int( 1 / U). 

(c) Z = (B — 3) 2 , B ~ Bin(5, 0.5). 

3. Determine the p.d.f. of 

(a) $X = -10 \log U + 5$.

(b) $X = 2\tan(\pi U) + 10$.

(c) $W = nU - \mathrm{int}(nU)$. Show that it is independent of $I = \mathrm{int}(nU)$. (Hint: Show
that $P(W \le w, I = i) = w/n$.)

4. Let $X$ have probability mass function

i           1     2     3     4     5     6     7
P(X = i)    0.3   0.12  0.09  0.12  0.1   0.17  0.1


(a) Use the acceptance-rejection algorithm and simulate 1,000 data points from 
this distribution. You may use a discrete uniform as your g. 

(b) Plot out the histogram of your simulation. 

(c) What is the expected number of acceptance for this distribution? Does that 
match your simulation results? 

5. Suppose that we want to simulate from the density

$$f(x) = x + 1/2, \quad 0 < x < 1.$$

(a) Using the inverse transformation method, simulate 1,000 values from $f$.

(b) Using the acceptance-rejection method, simulate another 1,000 values from
$f$. Which algorithm is more efficient?

6. Suppose that we want to simulate $|Z|$, where $Z \sim N(0, 1)$. That is, the absolute
value of a standard normal random variable. First note that the p.d.f. of $|Z|$ is
given by

$$f(x) = \frac{2}{\sqrt{2\pi}}\, e^{-x^2/2}, \quad 0 < x < \infty.$$

Suppose you want to use the acceptance-rejection algorithm to simulate $|Z|$. Take
$g$ to be the exponential distribution,

$$g(x) = e^{-x}, \quad 0 < x < \infty.$$

(a) Determine the value $c$ such that $c = \max_x \frac{f(x)}{g(x)}$.

(b) Using the acceptance-rejection method, simulate 1,000 values of $|Z|$.

(c) Suppose that you want to recover $Z$ from the simulated values of $|Z|$. One way
to do it is to generate a random number $U$ and set

$$Z = \begin{cases} |Z| & \text{if } U > 1/2, \\ -|Z| & \text{if } U \le 1/2. \end{cases}$$

Using this method, obtain 1,000 values of Z and plot its density. 

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.

STANDARD SIMULATIONS IN RISK MANAGEMENT


7.1 INTRODUCTION 

Risk management applications require simulation experiments. In this chapter, we 
introduce some standard simulation techniques and discuss their applications in risk 
management. 

7.2 SCENARIO ANALYSIS 

Scenario analysis of risk management refers to simulating possible scenarios to ana- 
lyze the risk of a decision and consequences. The ultimate goal of a scenario analysis 
may be to reach a decision, to verify a model, or to validate a certain conjecture. 

Suppose that a newspaper boy buys newspapers from an agent for $4 each and
sells them for $6 each. His problem is to decide how many newspapers to buy each
morning. In other words, what would be a prudent purchasing strategy?

To analyze the situation, he examines the sales record for the past 100 days given 
in Table 7.1. After reviewing the data in Table 7.1, he comes up with the following 
strategies: 

1. Each day, purchase the same number of papers sold the day before. 

2. Each day, purchase a fixed number of papers, say 23. 

To test each of these two strategies, one could simulate the scenarios using inverse 
transform. Firstly, convert the information in Table 7.1 into the empirical probability 
mass function (p.m.f.) (Table 7.2). 


Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong. 
© 2015 John Wiley & Sons, Inc. Published 2015 by John Wiley & Sons, Inc. 

TABLE 7.1 Sales Record

Number of Newspapers    Days Occurring
21                      15
22                      20
23                      30
24                      21
25                      14


TABLE 7.2 Probability Mass Function

Number of Newspapers    p.m.f.    Cumulative Distribution
21                      0.15      0.15
22                      0.20      0.35
23                      0.30      0.65
24                      0.21      0.86
25                      0.14      1.00


TABLE 7.3 Policy Simulation and Evaluation

          u ~ U(0, 1)    Number of Newspapers    Profit of 1    Profit of 2
Day 1     0.5828         23                      $46            $46
Day 2     0.0235         21                      $34            $34
Day 3     0.5155         23                      $42            $46
Day 4     0.3340         22                      $40            $40
Day 5     0.4329         23                      $44            $46
Day 6     0.2259         22                      $40            $40
Day 7     0.5798         23                      $44            $46
Day 8     0.7604         24                      $46            $46
Day 9     0.8298         24                      $48            $46
Day 10    0.6405         23                      $42            $46

                         Total profit =          $426           $436


Now simulate 10 future days and compare the two policies following the p.m.f.
given in Table 7.2. The simulation draws a standard uniform random variable u. The
demand for newspapers is generated according to where the random variable falls.
For instance, if u = 0.17, which belongs to the range of 0.15-0.35, then the
corresponding demand is 22. To have a fair comparison, assume that the newspaper boy
orders 23 papers on Day 1. Table 7.3 lists the results of the simulation. The interval
[0, 1] is partitioned according to the cumulative frequency in Table 7.2. According to
Table 7.3, policy 2 is better than policy 1. One can repeat the simulation many
times to see if this phenomenon is consistent.
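
The bookkeeping behind Table 7.3 can be reproduced with a short Python sketch (Python is
used here for illustration; it is not the book's code). Two modelling assumptions are
made, both inferred from the profits listed in the table rather than stated in the text:
unsold papers have no resale value, and under policy 1 the order equals the previous
day's demand. The names `CDF`, `demand_from_u`, `profit`, and `simulate` are hypothetical.

```python
CDF = [(21, 0.15), (22, 0.35), (23, 0.65), (24, 0.86), (25, 1.00)]  # Table 7.2
COST, PRICE = 4, 6


def demand_from_u(u):
    """Inverse transform: first demand whose cumulative probability exceeds u."""
    for demand, cumulative in CDF:
        if u < cumulative:
            return demand
    return CDF[-1][0]


def profit(order, demand):
    """Revenue on papers actually sold minus the cost of all papers ordered."""
    return PRICE * min(order, demand) - COST * order


def simulate(us, first_order=23, fixed_order=23):
    """Return total profits (policy 1, policy 2) for a sequence of uniforms."""
    order1 = first_order
    total1 = total2 = 0
    for u in us:
        demand = demand_from_u(u)
        total1 += profit(order1, demand)
        total2 += profit(fixed_order, demand)
        order1 = demand   # policy 1: tomorrow's order follows today's demand
    return total1, total2


if __name__ == "__main__":
    us = [0.5828, 0.0235, 0.5155, 0.3340, 0.4329,
          0.2259, 0.5798, 0.7604, 0.8298, 0.6405]   # the u column of Table 7.3
    print(simulate(us))   # (426, 436), matching the totals in Table 7.3
```

Replacing the fixed list `us` with freshly generated uniform numbers, and repeating many
times, gives the repeated comparison suggested in the text.
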

The newspaper boy example illustrates several important elements in scenario 
analysis. Decision makers identify possible scenarios on the basis of empirical data 
or experience. In this example, scenarios correspond to the daily demand of newspa- 
pers. Simulation is then developed to replicate future possibilities. We use the inverse 
transform with the empirical density function in this example. After generating sce- 
narios, a risk manager analyzes consequences corresponding to each scenario. If the 
first policy is adopted, then the number of newspapers purchased equals the number 
sold the previous day; otherwise, 23 papers are purchased. Finally, evaluation and 
comparison can be conducted using the simulated results. 

7.2.1 Value at Risk 

In finance, risk scenario analysis is usually conducted for evaluating value-at-risk 
(VaR), a widely adopted risk measure. 

Definition 7.1 VaR summarizes the worst loss of a portfolio over a target horizon 
with a given level of confidence. 

Statistically speaking, VaR describes the specified quantile or percentile of the
projected distribution of profits and losses over the target horizon. Let $R_t$ be the return of
a portfolio for a horizon $t$. Then, the $c\%$ confidence VaR of the portfolio is measured
through the expression:

$$P(R_t \le -\mathrm{VaR}) = (1 - c)\% := \alpha. \qquad (7.1)$$

Hence, VaR is the negative of the $\alpha$-th percentile of the probability distribution of
profits and losses. The larger the VaR, the higher the risk of the portfolio. An
advantage of VaR is that it allows the user to specify the confidence level to reflect individual
risk-averseness. For more details, see Jorion (2000).

VaR is indispensable for market risk analysis because it is the number that splits 
future possible asset returns into two scenarios: risky and nonrisky. Returns less than 
the negative of VaR belong to the class of risky scenario. Decision makers can evalu- 
ate their policies by examining consequences under the risky scenario. For instance, 
a bank may check if it maintains enough money for an extremely risky situation. 

A conventional way to measure VaR often assumes portfolio returns to follow 
a normal distribution. VaR obtained in this way is called normal VaR. A typical 
model is 


$$R_t = \mu + \sigma Z, \quad Z \sim N(0, 1). \qquad (7.2)$$

In such a parametric model, it is easy to derive that

$$\mathrm{VaR} = -z_\alpha \sigma - \mu, \qquad (7.3)$$

where $z_\alpha$ is the $\alpha$-quantile of the standard normal distribution, $\mu$ is the drift, and
$\sigma$ is the standard deviation of the return $R_t$ over the horizon $t$.

Although one can prove Equation 7.3 mathematically, we would like to verify it
by simulation. The algorithm is given as follows.

Step 1: Generate $n$ independent standard normal random variables, namely
$Z_j \sim N(0, 1)$ i.i.d. (independent and identically distributed), $j = 1, 2, \ldots, n$.

Step 2: Set $R_j = \mu + \sigma Z_j$.

Step 3: Rank $\{R_1, R_2, \ldots, R_n\}$ in ascending order as $\{R_1^*, R_2^*, \ldots, R_n^*\}$.

Step 4: Set $\mathrm{VaR} = -R_k^*$, where $k = \mathrm{int}(\alpha \times n)$.

Example 7.1 Let $\mu = 0.003$, $\sigma = 0.23$, $\alpha = 5\%$, and $n = 10{,}000$. Then, the 95%
VaR corresponds to the 500th smallest return generated from the simulation. Our
simulation shows that the VaR = 0.3783, which is close to the value, 0.3753, obtained
by Equation 7.3. To see the code in Visual Basic for Applications, go to the Online
Supplementary and download the file Chapter 7 Example 7.1.
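
For readers who prefer to experiment outside Excel, the four steps above can be sketched in Python. This is not the book's VBA program; the function names and the seed are our own choices, and the simulated value will differ slightly from 0.3783 because the random numbers differ.

```python
import random
from statistics import NormalDist


def simulated_normal_var(mu, sigma, alpha, n, seed=1):
    """Steps 1-4: simulate n normal returns and read off the k-th smallest."""
    rng = random.Random(seed)
    returns = sorted(mu + sigma * rng.gauss(0.0, 1.0) for _ in range(n))
    k = int(alpha * n)          # Step 4: k = int(alpha * n)
    return -returns[k - 1]      # k-th smallest return (list index is 0-based)


def closed_form_normal_var(mu, sigma, alpha):
    """Equation 7.3: VaR = -z_alpha * sigma - mu."""
    z_alpha = NormalDist().inv_cdf(alpha)
    return -z_alpha * sigma - mu


if __name__ == "__main__":
    print(simulated_normal_var(0.003, 0.23, 0.05, 10_000))
    print(closed_form_normal_var(0.003, 0.23, 0.05))
```

The two printed numbers should agree to roughly two decimal places, and increasing `n` tightens the agreement.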

7.2.2 Heavy-Tailed Distribution 

In reality, returns of market prices may not follow a normal distribution but a
heavy-tailed distribution. This means that the two tails of the empirical density decay
less rapidly than the normal density. Because a closed-form solution for the VaR of a
heavy-tailed distribution is not readily available, a feasible alternative is to generate
random variables according to a heavy-tailed distribution.

One commonly used form for heavy-tailed distribution is the generalized error
distribution (GED). The p.d.f. (probability density function) of GED with parameter
$\xi$ is given by

$$f(z) = \frac{\xi \exp\left(-\frac{1}{2}|z/\lambda|^{\xi}\right)}{\lambda\, 2^{1+1/\xi}\, \Gamma(1/\xi)}, \qquad \lambda = \left[\frac{2^{-2/\xi}\, \Gamma(1/\xi)}{\Gamma(3/\xi)}\right]^{1/2}, \qquad (7.4)$$

where $\Gamma(\cdot)$ denotes the Gamma function. Figure 7.1 plots the p.d.f. of GED, and
Figure 7.2 zooms in at the left tail of the density function. It is seen that the smaller
$\xi$ is, the heavier the left tail of the density function will be.

Figure 7.1 The shape of GED density function.

The key to simulating VaR is to generate random variables following the desired
distribution. In this case, we apply the rejection method introduced in Chapter 6 using
an exponential distribution for $g$. The algorithm goes as follows.

Step 1: Generate $Y \sim \mathrm{Exp}(1)$.

Step 2: Generate $U \sim U(0, 1)$.

Step 3: If $U < 2 f(Y) e^{Y}/a$, then $Z = Y$; else go to Step 1.

Step 4: Generate $V \sim U(0, 1)$. If $V < 1/2$, then $Z = -Y$.

Step 5: Repeat Steps 1-4 $n$ times to get $\{Z_1, Z_2, \ldots, Z_n\}$.

Step 6: Set $R_i = \mu + \sigma Z_i$.

Step 7: Sort the returns in ascending order as $\{R_1^*, R_2^*, \ldots, R_n^*\}$.

Step 8: Set $\mathrm{VaR} = -R_k^*$, where $k = \mathrm{int}(\alpha \times n)$.

Remarks

1. In Step 3, $a$ is a constant no less than $\max_y \{2 f(y) e^{y}\}$.

2. As the exponential distribution is defined on the positive real numbers,
Steps 1-3 of the algorithm generate positive GED. Step 4 converts a
positive GED random variable into a GED random variable.
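
The eight steps can be sketched in Python as follows. This is a minimal illustration rather than the book's VBA program. The function names are ours, and the envelope constant is found by a simple grid search with a 1% safety margin, which is our own shortcut in place of reading the maximum off a plot. The sketch assumes a parameter of at least 1, because for smaller values $2 f(y) e^{y}$ is unbounded and the exponential envelope no longer works.

```python
import math
import random


def ged_pdf(z, xi):
    """GED density with shape parameter xi, standardized to unit variance."""
    lam = math.sqrt(2.0 ** (-2.0 / xi) * math.gamma(1.0 / xi) / math.gamma(3.0 / xi))
    norm = lam * 2.0 ** (1.0 + 1.0 / xi) * math.gamma(1.0 / xi)
    return xi * math.exp(-0.5 * abs(z / lam) ** xi) / norm


def envelope_constant(xi, y_max=20.0, steps=20000, safety=1.01):
    """Grid estimate of max_y 2 f(y) e^y, inflated slightly (assumes xi >= 1)."""
    grid = (i * y_max / steps for i in range(steps + 1))
    return safety * max(2.0 * ged_pdf(y, xi) * math.exp(y) for y in grid)


def sample_ged(n, xi, a, rng):
    """Steps 1-5: rejection sampling from Exp(1), then a random sign."""
    out = []
    while len(out) < n:
        y = rng.expovariate(1.0)                          # Step 1
        u = rng.random()                                  # Step 2
        if u < 2.0 * ged_pdf(y, xi) * math.exp(y) / a:    # Step 3
            out.append(-y if rng.random() < 0.5 else y)   # Step 4
    return out


def ged_var(mu, sigma, xi, alpha, n, seed=7):
    """Steps 6-8: turn GED draws into returns and read off the quantile."""
    rng = random.Random(seed)
    a = envelope_constant(xi)
    returns = sorted(mu + sigma * z for z in sample_ged(n, xi, a, rng))
    return -returns[int(alpha * n) - 1]


if __name__ == "__main__":
    print(envelope_constant(1.21))
    print(ged_var(0.0004, 0.0116, 1.21, 0.01, 10_000))
```

Because the density in Equation 7.4 has unit variance, a quick sanity check on any implementation is that the simulated values have sample mean near 0 and sample variance near 1.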


7.2.3 Case Study: VaR of Dow Jones 

We demonstrate the use of GED-VaR by considering 10-year daily closing prices of the
Dow Jones Industrial Index (DJI) in the period of August 8, 1995, to August 7, 2004.
Data downloaded from http://finance.yahoo.com consist of 2,265 prices. The prices
are converted into 2,264 daily returns by the formula: 


Sample mean and standard deviation of the returns are 0.04% and 1.16% in a daily 
scale, respectively. From Equation 7.3, the 95% and 99% normal VaR from the sample 
are 1.87% and 2.66%, respectively. 
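
The two normal VaR figures follow directly from Equation 7.3 with the sample mean and standard deviation. The short Python check below reproduces the arithmetic; the function name is ours and the inputs are the rounded sample statistics quoted above.

```python
from statistics import NormalDist


def normal_var(mean, sd, confidence):
    """Equation 7.3 with alpha = 1 - confidence."""
    alpha = 1.0 - confidence
    return -NormalDist().inv_cdf(alpha) * sd - mean


if __name__ == "__main__":
    for c in (0.95, 0.99):
        # daily mean 0.04% and standard deviation 1.16%, expressed as decimals
        print(c, round(100 * normal_var(0.0004, 0.0116, c), 2))
```

With these inputs the script prints 1.87 for the 95% level and 2.66 for the 99% level, in agreement with the text.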

To assess the quality of normal VaR, one has to test the normality assumption
or, more precisely, the distributional assumption used in the VaR computation. Here, 
we introduce a simple but valuable tool, known as the quantile-quantile (QQ) plot. 
The idea is to plot the quantiles of the sample returns against the quantiles of the 
distribution used. If the returns truly follow the target distribution, then the graph 
should look similar to a straight line. For testing normality, the target distribution is 
the normal distribution. Systematic deviations from the line signal that the returns are 
not well described by the normal distribution. 

Figure 7.3 shows a QQ plot of our sample against the normal distribution. Large
deviations are observed in the two tails of the empirical data. Specifically, the
empirical quantile is less than the normal quantile in the left tail but larger than the
normal quantile in the right tail. These deviations strongly suggest that the empirical
data follow a heavy-tailed distribution.

We use GED to reduce the deviation from the QQ plot. Returns are first standardized
by the sample mean and standard deviation as

$$SR_t = \frac{R_t - 0.04\%}{1.16\%},$$

where $SR_t$ denotes the standardized return at time $t$. We conjecture that
$SR_t \sim \mathrm{GED}(\xi)$, identically and independently. The parameter $\xi$ is estimated
from the standardized returns using maximum likelihood estimation (MLE). Our estimation
shows that $\xi = 1.21$ (Appendix). Then, GED-VaR is estimated from the eight-step
algorithm in Section 7.2.2, where the constant $a$ is required in Step 3. The value of $a$
can be deduced from the plot of $2 f(y) e^{y}$ against $y$, where $f(y)$ is the p.d.f. of
GED(1.21).

Figure 7.4 Determine the maximum of $2 f(y) e^{y}$ graphically.

Figure 7.4 shows that the maximum function value is bounded above by 1.2 so that
we set $a = 1.2$.

To simulate GED-VaR in Visual Basic for Applications, go to the Online Supplementary
and download the file Chapter 7 Simulating GED-VaR.

The program estimates 95% and 99% VaR by generating 10,000 GED(1.21)
random variables. For the confidence intervals, it repeats the process 1,000 times to
get 1,000 VaR estimates. After arranging the simulated VaRs in ascending order, the
95% two-tailed confidence interval (CI) is the range between the 25th VaR and the
975th VaR.

To check the performance of GED-VaR, we use the QQ-plot of Figure 7.5 on the 
basis of one simulation. It is seen that deviations from the straight line have been 
substantially reduced. From this exercise, we see that GED(1.21) is appropriate for 
modeling the sample of Dow Jones returns. The average 95% VaR and 99% VaR from
the 1,000 simulations are 1.87% and 3.02%, respectively. Therefore, 95% GED-VaR
and 95% normal VaR give similar values, whereas 99% GED-VaR (3.02%) is more than
10% higher than the 99% normal VaR (2.66%).

These findings may be useful for a risk manager. As normal VaR is commonly 
used in the financial industry, it is essential for a risk manager to understand the 
limitation of the normal VaR. The rationale of this empirical study is that normal 
VaR is a good estimate for potential losses of a portfolio under “normal, nonextreme” 
scenarios. However, it underestimates potential losses when “extreme events” hap- 
pen, especially for those happening with probability less than 1%. To measure VaR 
with higher confidence level, for example, 99% VaR, the risk manager may consider 
GED-VaR. For further discussion about extreme values, see Embrechts, Klüppelberg,
and Mikosch (1997) and the themed volume of Finkenstädt and Rootzén (2004).


7.3 STANDARD MONTE CARLO 

In the preceding chapters, we studied the idea of simulating random variables. One of
the main reasons to simulate random variables is to estimate quantities such as $E(X)$,
which is related to the evaluation of definite integrals. Suppose we have already
generated $n$ values of a random variable $X$. It would then be very natural to estimate
the quantity $\theta = E(X)$ by $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. We study some standard
statistical techniques to assess the accuracy of such an estimate, which are based on the
law of large numbers and the central limit theorem. Whenever we estimate quantities such
as $E(X)$ on the basis of standard applications of simulations, we refer to these methods
as standard Monte Carlo simulations. We study other more sophisticated simulation
methods in later chapters.


7.3.1 Mean, Variance, and Interval Estimation 


Suppose that $X$ is a given random variable with mean $\theta$ and variance $\sigma^2$. A natural
way to evaluate $\theta = E(X)$ using simulations is to generate random values $X_1, \ldots, X_n$
and calculate the quantity

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i,$$

which is called the sample mean of $\{X_1, \ldots, X_n\}$. It is easy to see that

$$E(\bar{X}_n) = E(X) = \theta, \quad \text{unbiasedness property,} \qquad (7.5)$$

$$\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}. \qquad (7.6)$$


To assess the accuracy of $\bar{X}_n$ as an estimate of $\theta$, we rely on two important results.
The first one is the law of large numbers, which asserts that the larger the number of
simulations $n$, the closer $\bar{X}_n$ is to $\theta$; see, for example, Casella and Berger (2001).
Specifically,

Theorem 7.1 Let $X_1, \ldots, X_n$ be i.i.d. random variables with mean $\theta$ and variance
$\sigma^2$. Then for any given $\epsilon > 0$,

$$P(|\bar{X}_n - \theta| > \epsilon) \to 0 \quad \text{as } n \to \infty.$$

This result is sometimes written as $\bar{X}_n \to \theta$ in probability.

The second one is the central limit theorem, which asserts that as $n$ tends to infinity,
the distribution of the random variable $\bar{X}_n$ behaves approximately as a normal
distribution.

Theorem 7.2 Let $X_1, \ldots, X_n$ be i.i.d. random variables with mean $\theta$ and variance
$\sigma^2 > 0$. Then as $n$ tends to infinity,

$$P\left(\frac{\sqrt{n}(\bar{X}_n - \theta)}{\sigma} \le z\right) \to \Phi(z),$$

where $\Phi(z)$ denotes the c.d.f. (cumulative distribution function) of a standard normal
distribution evaluated at the point $z$.

An equivalent statement of this result is that the random variable
$\sqrt{n}(\bar{X}_n - \theta)/\sigma$ converges in distribution to $Z$, written as

$$\frac{\sqrt{n}(\bar{X}_n - \theta)}{\sigma} \xrightarrow{\mathcal{L}} Z,$$

where $Z \sim N(0, 1)$. The proof of these two results can be found in standard textbooks
in probability; see Billingsley (1999) for example. One immediate application of the
central limit theorem is to construct approximate confidence intervals for $\theta$. According
to Theorem 7.2,

$$P\left(|\bar{X}_n - \theta| > \frac{c\,\sigma}{\sqrt{n}}\right) \approx P(|Z| > c) = 2(1 - \Phi(c)).$$

As a result, if we let $c = 1.96$, then the probability that $\bar{X}_n$ differs from $\theta$ by more than
$1.96\sigma/\sqrt{n}$ is approximately 0.05. In other words, we are relatively confident (95%)
that our estimate is within about two standard errors ($1.96\sigma/\sqrt{n}$) of $\theta$. To make use
of this result, we need to know the value of $\sigma$, which is usually unavailable. A simple
fix is to estimate it from the simulated values. The sample variance, which is defined as

$$S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2,$$

constitutes an estimate of $\sigma^2$. It can be easily shown that

$$E(S^2) = \sigma^2, \quad \text{unbiasedness property,} \qquad (7.7)$$

$$S_{j+1}^2 = (1 - 1/j) S_j^2 + (j + 1)(\bar{X}_{j+1} - \bar{X}_j)^2, \qquad (7.8)$$

where $\bar{X}_j$ and $S_j^2$ denote the sample mean and sample variance of the first $j$ values.

One frequently asked question in simulations is when to stop generating values of $X$
and evaluating $\bar{X}_n$. The answer to this question is given by the following
scheme:

1. Choose an appropriate value $d$ for the standard deviation of the estimator. That
is, $d$ represents the margin of error we can tolerate using simulations.

2. Generate at least 100 values of $X$.

3. Continue generating values of $X$, stopping when we have $k$ values of $X$ such that
$S/\sqrt{k} < d$.

4. The desired estimate is given by $\bar{X}_k$.
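
The stopping scheme, together with the recursion in Equation 7.8, can be sketched in Python as follows. The function name `mc_until_precise` and its arguments are our own, and `draw` stands for any user-supplied function that returns one simulated value of X.

```python
import math
import random


def mc_until_precise(draw, d, min_n=100):
    """Draw values until S / sqrt(k) < d, using at least min_n values.

    Returns the sample mean, the sample standard deviation S, and k.
    """
    k, mean, s2 = 0, 0.0, 0.0
    while True:
        x = draw()
        k += 1
        new_mean = mean + (x - mean) / k
        if k >= 2:
            # Equation 7.8 with j = k - 1 (S_1^2 is taken to be 0).
            j = k - 1
            s2 = (1.0 - 1.0 / j) * s2 + (j + 1) * (new_mean - mean) ** 2
        mean = new_mean
        if k >= min_n and math.sqrt(s2 / k) < d:
            return mean, math.sqrt(s2), k


if __name__ == "__main__":
    rng = random.Random(5)
    estimate, s, k = mc_until_precise(rng.random, d=0.01)
    print(estimate, s, k)
```

Updating the mean and variance recursively avoids storing the simulated values, which matters when the required number of runs is not known in advance.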

Finally, we can form an interval estimate for $\theta$ by using the notion of confidence
intervals.

Definition 7.2 If $\bar{X}_n = \bar{x}$ and $S = s$, then the interval

$$\left(\bar{x} - z_{1-\alpha/2} \frac{s}{\sqrt{n}},\; \bar{x} + z_{1-\alpha/2} \frac{s}{\sqrt{n}}\right)$$

is an approximate $100(1 - \alpha)\%$ confidence interval for $\theta$.

In particular, when $\alpha = 0.05$, $z_{1-\alpha/2} = 1.96$ and $(\bar{x} \pm 1.96\, s/\sqrt{n})$ is an
approximate 95% confidence interval for $\theta$, thus giving rise to the rule of “two sigma.”


7.3.2 Simulating Option Prices 

To illustrate the ideas of standard simulations in risk management, consider first sim- 
ulating stock prices. Let S denote the price of a stock. Recall that we usually assume 
that S follows a geometric Brownian motion 

$$dS = \mu S\, dt + \sigma S\, dW.$$

Equivalently,

$$d \log S = \nu\, dt + \sigma\, dW,$$

where $\nu = \mu - \sigma^2/2$. Using the last equation and letting $\epsilon$ denote a standard
normal random variable, we can generate $S$ according to the formula

$$S(t + dt) = S(t) \exp(\nu\, dt + \sigma \epsilon \sqrt{dt}).$$

In particular, $S_T = S_0 e^{X_T}$, where $X_T = \nu T + \sigma W_T \sim N(\nu T, \sigma^2 T)$ (Section 4.3),
so we have

$$S(T) = S(0) \exp(\nu T + \sigma \epsilon \sqrt{T}). \qquad (7.9)$$

Notice that according to the risk-neutral valuation principle, we usually take $\mu = r$,
the risk-free rate.

Example 7.2 Let $S_0 = 10$, $\mu = r = 0.03$, $\sigma = 0.4$, and $dt = 1/52$. We want to
simulate weekly prices of the stock $S_i$, $i = 1, \ldots, 52$, for a 1-year period. Then
$\nu = \mu - \sigma^2/2 = -0.05$ and the results are given in Table 7.4. To see the code in Visual
Basic for Applications, go to the Online Supplementary and download the file Chapter 7
Example 7.2 Simulating weekly prices of the stock.
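
As a rough Python counterpart to the VBA file, the weekly path of Example 7.2 can be generated with the one-step formula above. The function name and seed are our own, and the simulated prices will not match Table 7.4 because the random numbers differ.

```python
import math
import random


def simulate_gbm_path(s0, mu, sigma, dt, steps, seed=0):
    """Iterate S(t + dt) = S(t) exp(nu dt + sigma eps sqrt(dt)), nu = mu - sigma^2/2."""
    rng = random.Random(seed)
    nu = mu - 0.5 * sigma ** 2
    path = [s0]
    for _ in range(steps):
        eps = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp(nu * dt + sigma * eps * math.sqrt(dt)))
    return path


if __name__ == "__main__":
    weekly = simulate_gbm_path(10.0, 0.03, 0.4, 1.0 / 52.0, 52)
    for week in (0, 1, 2, 51, 52):
        print(week, round(weekly[week], 5))
```

Because prices are obtained by exponentiating, every simulated price stays positive regardless of the random draws.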

TABLE 7.4 Simulated Prices of the First and the Last 10 Weeks

Week    Price
0       10.00000
1       10.38419
2       10.37402
3       10.67406
4       11.65342
5       11.89871
6       11.28296
7       11.15327
8       10.33483
9       11.16090
10      12.14546
...     ...
43      14.39009
44      13.78038
45      14.01125
46      12.72393
47      13.44627
48      13.05377
49      12.00424
50      12.74416
51      12.16204
52      12.15517

Suppose that we want to calculate the price of a European call option maturing in
1 year with strike price $K = 12$. We can use the Black-Scholes formula to obtain the
call price $C$ as

$$C(S, t) = S \Phi(d_1) - K e^{-r(T - t)} \Phi(d_2),$$

where

$$d_1 = \frac{1}{\sigma\sqrt{T - t}} \left(\log(S/K) + (r + \sigma^2/2)(T - t)\right) \quad \text{and} \quad d_2 = d_1 - \sigma\sqrt{T - t}.$$

Substituting the values of $r = 0.03$, $K = 12$, $T = 1$, $t = 0$, $\sigma = 0.4$, and $S_0 = 10$, we get

$$d_1 = \frac{1}{0.4\sqrt{1}} \left(\log(10/12) + (0.03 + 0.08)(1)\right) = -0.1808, \quad d_2 = d_1 - 0.4 = -0.5808.$$

Using the Splus command pnorm(z) to evaluate $\Phi(z)$, we get $\Phi(d_1) = 0.4283$ and
$\Phi(d_2) = 0.2807$. Hence,

$$C = 10(0.4283) - 12 e^{-0.03}(0.2807) = 1.013918.$$

On the other hand, we can evaluate $C = e^{-rT} E(S_T - K)^+$.
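
The closed-form calculation above is easy to reproduce. The sketch below uses the standard normal c.d.f. from Python's standard library in place of pnorm; the function name is ours.

```python
import math
from statistics import NormalDist


def black_scholes_call(s, k, r, sigma, tau):
    """Black-Scholes call price with time to maturity tau = T - t.

    Returns the price together with d1 and d2.
    """
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    phi = NormalDist().cdf
    price = s * phi(d1) - k * math.exp(-r * tau) * phi(d2)
    return price, d1, d2


if __name__ == "__main__":
    price, d1, d2 = black_scholes_call(10.0, 12.0, 0.03, 0.4, 1.0)
    print(round(d1, 4), round(d2, 4), round(price, 4))
```

Small differences from 1.013918 in the last digits can arise from rounding the normal probabilities to four decimals in the hand calculation.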

Example 7.3 The price of the European call option can now be computed using
simulations.

1. First generate $n$ independent values $S_1(T), \ldots, S_n(T)$ according to Equation 7.9.

2. Compute simulated discounted call prices $C_i = e^{-rT} \max\{S_i(T) - K, 0\}$,
$i = 1, \ldots, n$.

3. Compute $\bar{C} = \frac{1}{n} \sum_{i=1}^{n} C_i$. $\bar{C}$ is an estimate of the discounted expected
payoff $e^{-rT} E(S_T - K)^+$.

4. Construct a 95% confidence interval for $C$ from

$$\bar{C} \pm 1.96\, S/\sqrt{n},$$

where

$$S = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (C_i - \bar{C})^2}$$

is the sample standard deviation of the simulated call prices $C_i$.


To simulate 100 paths of stock price to compute call option price and its confi- 
dence interval in Visual Basic for Applications, go to the Online Supplementary and 
download the file Chapter 7 Example 7.3 Computing European call option price by 
simulation. 

Outputs of the simulated $C_i$'s are given in Table 7.5. The result of a 100-path
simulation shows that the 95% confidence interval for $C$ is (0.37, 1.83). Figure 7.6 shows
that when the number of runs increases, the value of $\bar{C}$ converges to the limit of 1.01.
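
A minimal Python sketch of the four steps of Example 7.3 is given below. It is not the book's VBA file; the function name and seed are ours, and individual runs will produce different intervals.

```python
import math
import random


def mc_call_price(s0, k, r, sigma, t, n, seed=0):
    """Estimate e^{-rT} E(S_T - K)^+ and a 95% confidence interval."""
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2
    disc = math.exp(-r * t)
    payoffs = []
    for _ in range(n):
        # Equation 7.9 with mu = r
        s_t = s0 * math.exp(nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0))
        payoffs.append(disc * max(s_t - k, 0.0))
    mean = sum(payoffs) / n
    s = math.sqrt(sum((c - mean) ** 2 for c in payoffs) / (n - 1))
    half_width = 1.96 * s / math.sqrt(n)
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    for n in (100, 10_000):
        print(n, mc_call_price(10.0, 12.0, 0.03, 0.4, 1.0, n))
```

The width of the interval shrinks at the rate $1/\sqrt{n}$, so going from 100 to 10,000 paths narrows it by a factor of about 10.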


TABLE 7.5 The Discounted Call Prices for the First 20 Paths

Path    C_i
1       0.0000000
2       0.0000000
3       0.0000000
4       5.9331955
5       1.1971242
6       0.0000000
7       2.2395878
8       0.0000000
9       0.0000000
10      4.0065595
11      0.0000000
12      1.3006804
13      0.0000000
14      0.0000000
15      0.0000000
16      0.0000000
17      6.0970236
18      0.0000000
19      0.0000000
20      0.1768191

Figure 7.6 Simulations of the call price against the size. (Horizontal axis: No. of
simulations, from 0 to 15,000; vertical axis ticks from 0.90 to 1.15.)


7.3.3 Simulating Option Delta 

In risk management, hedging an option is sometimes more important than valuing 
the option. When a bank issues structured financial products to enhance sales, the 
embedded option risk would be of great concern. Hedging is a useful device to man- 
age such a risk. For a standard call option, the hedge ratio refers to the delta of the 
option, the partial derivative of the option price with respect to the underlying asset 
price. Under the Black-Scholes assumption, the delta of a call is given by

$$\text{delta} = \frac{\partial C}{\partial S} = \Phi(d_1). \qquad (7.10)$$

We use simulation to calculate the hedge ratio, delta, for general European options.

The risk-neutral valuation asserts that an option with payoff $F(S_T)$ can be valued
as $e^{-rT} E[F(S_T) \mid S_0 = S]$. Therefore, delta equals

$$\text{delta} = e^{-rT} \frac{\partial}{\partial S} E[F(S_T) \mid S_0 = S]. \qquad (7.11)$$

In order to compute delta under the Black-Scholes dynamics, the following
theorem is established.

Theorem 7.3 The delta of a European option with payoff $F(S_T)$ is given by

$$\text{delta} = e^{-rT} E\left[F(S_T) \frac{W_T}{S \sigma T}\right], \qquad (7.12)$$

where $W_T$ is the standard Brownian motion driving $S_T$.

Proof. Ignoring the discount factor, the definition of delta in Equation 7.11 is

$$\frac{\partial}{\partial S} \int_{-\infty}^{\infty} F(e^x) \phi(x \mid \log S)\, dx = \int_{-\infty}^{\infty} F(e^x) \frac{1}{S} \frac{\partial \phi(x \mid \log S)}{\partial \log S}\, dx,$$

where

$$\phi(x \mid y) = \frac{1}{\sigma\sqrt{2\pi T}} \exp\left(-\frac{(x - y - \nu T)^2}{2\sigma^2 T}\right).$$

Standard differentiation shows that

$$\frac{\partial \phi(x \mid y)}{\partial y} = \phi(x \mid y)\, \frac{x - y - \nu T}{\sigma^2 T}.$$

Hence, we have

$$\text{delta} = e^{-rT} \int_{-\infty}^{\infty} F(e^x)\, \frac{x - \log S - \nu T}{S \sigma^2 T}\, \phi(x \mid \log S)\, dx.$$

Recall that $x = \log S_T$, so that

$$x - \log S - \nu T = \log S_T - \log S - \nu T = \sigma W_T.$$

This completes the proof. □

Theorem 7.3 enables us to simulate option delta (or even gamma) as follows.

Step 1: Generate $Z_1, Z_2, \ldots, Z_n \sim N(0, 1)$ i.i.d.

Step 2: Set $Y_j = F\left(S e^{(r - \sigma^2/2)T + \sigma\sqrt{T} Z_j}\right) \dfrac{Z_j}{S \sigma \sqrt{T}}$.

Step 3: Set $\text{delta} = \dfrac{e^{-rT}}{n} \sum_{j=1}^{n} Y_j$.

The theorem can be extended to the case of path-dependent options. However, the
derivation requires some knowledge of Malliavin calculus, which is beyond the
scope of the book. For details of this generalization, we refer to the articles of Fournié
et al. (1999, 2000).

Example 7.4 The current price is $10, interest rate 5%, and volatility 40%. Simu- 
late the price and delta of a call option with strike price $12 and maturity 1 year by 
generating 10,000 terminal asset prices. 

An algorithm can be constructed as follows: 


Step 1: Generate 10,000 terminal asset prices by the formula

$$S_T^j = S_0 \exp\left[(r - \sigma^2/2)T + \sigma\sqrt{T} Z_j\right], \quad Z_j \sim N(0, 1).$$

Step 2: For $j = 1$ to 10,000, compute

$$C_j = \max(S_T^j - K, 0) \cdot \exp(-rT) \quad \text{and} \quad \mathrm{Del}_j = C_j \cdot Z_j / (\sigma\sqrt{T} S_0).$$

Step 3: Compute call price = mean($C_j$) and delta = mean($\mathrm{Del}_j$).

To simulate call option price and its delta in Visual Basic for Applications, go to 
the Online Supplementary and download the file Chapter 7 Example 7.4 Simulating 
price and delta of a call option. 

With a CPU time of 0.9 s, our simulation finds that the call price is 1.06 and the 
delta is 0.44. The Black-Scholes call price and the delta are 1.08 and 0.448, respec- 
tively. This demonstrates the efficiency and accuracy of the simulation algorithm. 
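
The three steps of Example 7.4 translate almost line by line into Python. The sketch below is ours, not the book's VBA file, and its output depends on the seed; we use more paths than the example so that the estimates are stable.

```python
import math
import random


def mc_price_and_delta(s0, k, r, sigma, t, n, seed=0):
    """Estimate the call price and its delta from the same simulated paths."""
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2
    disc = math.exp(-r * t)
    price_sum = 0.0
    delta_sum = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(nu * t + sigma * math.sqrt(t) * z)    # Step 1
        c = disc * max(s_t - k, 0.0)                              # Step 2: C_j
        price_sum += c
        delta_sum += c * z / (sigma * math.sqrt(t) * s0)          # Step 2: Del_j
    return price_sum / n, delta_sum / n                           # Step 3


if __name__ == "__main__":
    print(mc_price_and_delta(10.0, 12.0, 0.05, 0.4, 1.0, 200_000))
```

Note that the delta estimator reuses the simulated payoffs and needs no derivative of the payoff function, which is why the approach works for arbitrary payoffs $F(S_T)$.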

One thing we have to stress is that Theorem 7.3 is very useful for simulating deltas 
of single asset European options, with arbitrary payoff F(S T ). However, it may not 
be applicable for path-dependent options and multiasset options. Therefore, we intro- 
duce alternative methods in later chapters. 


7.4 EXERCISES 

1. Write the VBA code for the newspaper boy example of Section 7.2. 

2. Suppose that the asset return follows the t-distribution with two degrees of
freedom. Write a VBA code to simulate the 95% confidence VaR with parameters
given in Example 7.1. Compare your result with the one obtained by normal VaR.

3. Implement the rejection method for generating GED when $\xi = 1.4$.

4. Verify Equations 7.5 and 7.6. 

5. Verify Equations 7.7 and 7.8. 

6. Let $S_0 = 100$, $\mu = r = 0.05$, $\sigma = 0.3$. Use the geometric Brownian motion method
to simulate 20 daily prices of the stock $S_i$, $i = 1, \ldots, 20$.

(a) Suppose that you want to determine the price of a European put option matur- 
ing in 20 days with a strike price K = 100. Use simulation techniques to 
estimate this price. 

(b) Compare your result with the one obtained from the Black-Scholes formula. 
Are they similar? 

7. The gamma of an option is defined as

$$\text{gamma} = \frac{\partial(\text{delta})}{\partial S}.$$

(a) What is the financial interpretation of the gamma?

(b) By modifying the proof of Theorem 7.3, show that

$$\text{gamma} = e^{-rT} E\left[\frac{F(S_T)}{S^2 \sigma T} \left(\frac{W_T^2}{\sigma T} - W_T - \frac{1}{\sigma}\right)\right].$$

(c) Construct and implement a simulation algorithm to compute the call option
gamma with a Splus code or VBA code.

(d) Suppose that $S = 10$, $K = 12$, $r = 0.1$, $\sigma = 0.3$, and $T = 0.8$. Compare your
simulation result with the closed form solution:

$$\text{gamma} = \frac{\Phi'(d_1)}{S \sigma \sqrt{T}},$$

where $\Phi'$ denotes the standard normal density.



The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.


7.5 APPENDIX 

The data comprise 2,264 daily rates of returns. These data are transformed into
standardized returns by using the sample mean and standard deviation. We assume that
standardized returns follow a GED distribution with parameter $\xi$. Our goal is to
estimate $\xi$. The density function of the GED distribution is given in Equation 7.4. Hence,
the likelihood function is

$$L(\xi) = \prod_{i=1}^{2264} \frac{\xi \exp\left(-\frac{1}{2}|Z_i/\lambda|^{\xi}\right)}{\lambda\, 2^{1+1/\xi}\, \Gamma(1/\xi)}, \qquad \lambda = \left[\frac{2^{-2/\xi}\, \Gamma(1/\xi)}{\Gamma(3/\xi)}\right]^{1/2},$$

where $Z_1, \ldots, Z_{2264}$ are standardized returns. Instead of deriving the MLE
theoretically, we search for the maximum point of the likelihood function with a numerical
method. To confine the target point to a small interval, we plot the likelihood function
against the parameter $\xi$. In Figure 7.7, we recognize that a unique maximum
appears for $\xi \in (1, 1.3)$. The plot is given after this section. We then use the bisection
method to search for the solution. Specifically, we compare $L(1)$ and $L(1.3)$ and
discard the smaller one. The next step compares the remaining one with $L(1.15)$, the
functional value at the mid-point of 1 and 1.3. We discard the point with a smaller
value in $L$ and repeat the procedure until a sufficiently accurate solution is obtained.
Ultimately, $\xi = 1.21$, which has been input to generate GED-VaR in Section 7.2.3.

Figure 7.7 The log likelihood against $\xi$.
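
The numerical search can be sketched in Python as follows. Two caveats apply. First, we do not have the Dow Jones data here, so the usage example feeds in simulated standard normal values, for which the estimate should come out near 2 (GED with parameter 2 is the standard normal distribution). Second, the sketch uses a golden-section search on the log likelihood, which is a substitute for the interval-halving scheme described above and assumes a single maximum on the search interval. All names are ours.

```python
import math
import random


def ged_log_likelihood(xi, zs):
    """Log of the likelihood L(xi) for standardized observations zs."""
    lam = math.sqrt(2.0 ** (-2.0 / xi) * math.gamma(1.0 / xi) / math.gamma(3.0 / xi))
    log_norm = math.log(lam) + (1.0 + 1.0 / xi) * math.log(2.0) + math.lgamma(1.0 / xi)
    return sum(math.log(xi) - 0.5 * abs(z / lam) ** xi - log_norm for z in zs)


def maximize_on_interval(func, lo, hi, tol=1e-4):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
    f1, f2 = func(x1), func(x2)
    while hi - lo > tol:
        if f1 < f2:                       # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + ratio * (hi - lo)
            f2 = func(x2)
        else:                             # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - ratio * (hi - lo)
            f1 = func(x1)
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(4000)]   # stand-in for real returns
    print(maximize_on_interval(lambda xi: ged_log_likelihood(xi, data), 1.0, 3.0))
```

Working with the log likelihood rather than the likelihood itself avoids numerical underflow when multiplying thousands of small density values.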
8.1 INTRODUCTION


In standard Monte Carlo, we estimate the unknown quantity $\theta = E(X)$ by generating
random numbers $X_1, \ldots, X_n$ and using $\bar{X}_n$ to estimate $\theta$. Recall that in the preceding
chapter, the standard error for $\bar{X}_n$ is $\sigma/\sqrt{n}$, where $\sigma^2$ is the variance of $X$. There
are two sources of contributions to the standard error of estimation. One is the factor
$1/\sqrt{n}$, which is intrinsic to the Monte Carlo method, and not much can be done about
it. The other one is the standard deviation $\sigma$ of the output $X$, which can be improved
upon by some techniques. There are usually four standard methods to reduce $\sigma$:

1. Antithetic variables

2. Control variates 

3. Stratification 

4. Importance sampling 

We discuss each of these methods in the subsequent sections. 


8.2 ANTITHETIC VARIABLES 

The idea of antithetic variables can best be illustrated by considering a special
example. Suppose that we want to estimate $\theta = E(X)$ by generating two outputs $X_1$
and $X_2$ such that $E(X_1) = E(X_2) = \theta$ and $\mathrm{Var}(X_1) = \mathrm{Var}(X_2) = \sigma^2$. Then

$$\mathrm{Var}\left(\tfrac{1}{2}(X_1 + X_2)\right) = \tfrac{1}{4}\left(\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2)\right) = \frac{\sigma^2}{2} + \frac{1}{2}\mathrm{Cov}(X_1, X_2) \le \frac{\sigma^2}{2} \quad \text{if } \mathrm{Cov}(X_1, X_2) \le 0.$$

Note that when $X_1$ and $X_2$ are independent, $\mathrm{Var}((X_1 + X_2)/2) = \sigma^2/2$. Thus, the
aforementioned inequality asserts that if $X_1$ and $X_2$ are negatively correlated, then the
variance of the mean of the two would be less than in the case when $X_1$ and $X_2$ are
independent.

Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong.
©2015 John Wiley & Sons, Inc. Published 2015 by John Wiley & Sons, Inc.

How do we generate negatively correlated random numbers? Suppose that we simulate
$U_1, \ldots, U_m$, which are uniform random numbers. Then $V_1 = 1 - U_1, \ldots, V_m = 1 - U_m$
would also be uniform random numbers with the property that $(U_i, V_i)$ are
negatively correlated (exercise). If $X_1 = h(U_1, \ldots, U_m)$, then $X_2 = h(V_1, \ldots, V_m)$ must
have the same distribution as $X_1$. It turns out that if $h$ is a monotone function (either
increasing or decreasing) in each of its arguments, then $X_1$ and $X_2$ are negatively
correlated. This result is proved later at the end of this section. Thus, after generating
$U_1, \ldots, U_m$ to compute $X_1$, instead of generating another new independent set of $U$'s
to compute $X_2$, we compute $X_2$ by

$$X_2 = h(V_1, \ldots, V_m) = h(1 - U_1, \ldots, 1 - U_m).$$

Accordingly, $(X_1 + X_2)/2$ should have smaller variance.

In general, we may generate $X_i = F^{-1}(U_i)$ using the inverse transform method. Let
$Y_i = F^{-1}(1 - U_i)$. As $F$ is monotone, so is $F^{-1}$ and, hence, $X_i$ and $Y_i$ will be negatively
correlated. Both $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_n$ generated in this way are i.i.d. (independent
and identically distributed) sequences with c.d.f. (cumulative distribution function)
$F$, but negatively correlated.

Definition 8.1 The $Y_i$ sequence is called the sequence of antithetic variables.

For normal distributions, generating antithetic variables is straightforward. Suppose
that $X_i \sim N(\mu, \sigma^2)$, then $Y_i = 2\mu - X_i$ also has a normal distribution with mean
$\mu$ and variance $\sigma^2$, and $X_i$ and $Y_i$ are negatively correlated.

More generally, if we want to compute $E(H(X))$ for some function $H$, standard
Monte Carlo suggests using $\frac{1}{n} \sum_{i=1}^{n} H(X_i)$. Then an antithetic estimator of $E(H(X))$ is

$$\hat{H}_{AN} = \frac{1}{2n} \sum_{i=1}^{n} \left(H(X_i) + H(Y_i)\right),$$

where $Y_i$ is a sequence of antithetic variables. To see how variance reduction
is achieved by using this antithetic estimator, let $\mathrm{Var}(H(X)) = \sigma^2$ and
$\mathrm{Corr}(H(X), H(Y)) = \rho$. Consider

$$\mathrm{Var}(\hat{H}_{AN}) = \frac{1}{4n^2} \sum_{i=1}^{n} \left\{\mathrm{Var}\, H(X_i) + \mathrm{Var}\, H(Y_i) + 2\,\mathrm{Cov}(H(X_i), H(Y_i))\right\}$$

$$= \frac{1}{4n^2} (2n\sigma^2 + 2n\rho\sigma^2)$$

$$= \frac{\sigma^2}{2n} (1 + \rho).$$

Note that when $H(X)$ and $H(Y)$ are uncorrelated ($\rho = 0$), the variance is
reduced by a factor of 2, which is equivalent to doubling the simulation size. On the
other hand, if $\rho = -1$, then the variance would be reduced to zero. As long as $\rho$ is
negative, some form of variance reduction can be achieved. An obvious question,
in view of this observation, is why not choose $Y$ so that $\rho = -1$? Such $Y$'s may be
difficult to construct, as $\rho$ represents the correlation between $H(X)$ and $H(Y)$. In the
case $H(X) = X$, $\hat{H}_{AN}$ reduces to a constant, which is the perfect scenario. In view
of these caveats, we usually choose the antithetic variables $Y$ so that $\rho$ is negative, not
necessarily $-1$. When $H$ is linear, such as the case $H(X) = X$, the antithetic variable
works best. In general, the more linear $H$ is, the more effective the antithetic
variable will be.

Example 8.1 Let $\theta = \int_0^1 e^x\, dx$.

We know that $\theta = e - 1$. Consider the antithetic variable $V = 1 - U$. Recall that the
moment-generating function of $U$ equals $E(e^{tU}) = (e^t - 1)/t$. Now

$$\mathrm{Cov}(e^U, e^V) = E(e^U e^V) - E(e^U) E(e^V)$$

$$= E(e^U e^{1-U}) - E(e^U) E(e^{1-U})$$

$$= e - (e - 1)^2 = -0.2342.$$

Furthermore,

$$\mathrm{Var}(e^U) = E(e^{2U}) - (E(e^U))^2 = (e^2 - 1)/2 - (e - 1)^2 = 0.242.$$

Thus, for $U_1$ and $U_2$ independent uniform (0, 1) random variables,

$$\mathrm{Var}[(e^{U_1} + e^{U_2})/2] = \mathrm{Var}(e^U)/2 = 0.121.$$

But

$$\mathrm{Var}[(e^U + e^V)/2] = \mathrm{Var}(e^U)/2 + \mathrm{Cov}(e^U, e^V)/2 = 0.121 - 0.2342/2 = 0.0039,$$

achieving a substantial variance reduction of 96.7%.
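
The numbers in Example 8.1 can be checked empirically. The Python sketch below (with names of our own choosing) compares the sample variance of an average of two independent draws with that of an antithetic pair.

```python
import math
import random


def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def compare_estimators(n_pairs, seed=0):
    """Return (variance with independent pairs, variance with antithetic pairs)."""
    rng = random.Random(seed)
    indep, anti = [], []
    for _ in range(n_pairs):
        u1, u2 = rng.random(), rng.random()
        indep.append(0.5 * (math.exp(u1) + math.exp(u2)))        # independent U1, U2
        anti.append(0.5 * (math.exp(u1) + math.exp(1.0 - u1)))   # U and V = 1 - U
    return sample_variance(indep), sample_variance(anti)


if __name__ == "__main__":
    v_indep, v_anti = compare_estimators(50_000)
    print(v_indep, v_anti, 1.0 - v_anti / v_indep)
```

The two sample variances should be close to 0.121 and 0.0039, respectively, so that the printed reduction is close to the theoretical figure.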

We are now ready to justify the argument used in advocating antithetic variables.

Theorem 8.1 Let $X_1, \ldots, X_n$ be independent, then for any increasing functions $f$
and $g$ of $n$ variables,

$$E(f(X) g(X)) \ge E f(X)\, E g(X),$$

where $X = (X_1, \ldots, X_n)$.

Proof. By mathematical induction. Consider $n = 1$, then

$$(f(x) - f(y))(g(x) - g(y)) \ge 0, \quad \text{for all } x \text{ and } y,$$

as both factors are either non-negative ($x \ge y$) or nonpositive ($x \le y$). Thus, for any
random variables $X$ and $Y$,

$$(f(X) - f(Y))(g(X) - g(Y)) \ge 0 \quad \text{implying} \quad E((f(X) - f(Y))(g(X) - g(Y))) \ge 0.$$

In other words,

$$E(f(X) g(X)) + E(f(Y) g(Y)) \ge E(f(X) g(Y)) + E(f(Y) g(X)).$$

If $X$ and $Y$ are independent and identically distributed, then

$$E(f(X) g(X)) = E(f(Y) g(Y))$$

and

$$E(f(X) g(Y)) = E(f(Y) g(X)) = E(f(Y)) E(g(X)) = E(f(X)) E(g(X))$$

so that

$$E(f(X) g(X)) \ge E(f(X)) E(g(X)),$$

proving the result for the case $n = 1$. Assume the result for $n - 1$. Suppose $X_1, \ldots, X_n$
are independent and let $f$ and $g$ be increasing functions. Then

$$E(f(X) g(X) \mid X_n = x_n) = E(f(X_1, \ldots, X_{n-1}, x_n)\, g(X_1, \ldots, X_{n-1}, x_n) \mid X_n = x_n)$$

$$= E(f(X_1, \ldots, X_{n-1}, x_n)\, g(X_1, \ldots, X_{n-1}, x_n)) \quad \text{(because of independence)}$$

$$\ge E(f(X_1, \ldots, X_{n-1}, x_n))\, E(g(X_1, \ldots, X_{n-1}, x_n)) \quad \text{(by induction hypothesis)}$$

$$= E(f(X) \mid X_n = x_n)\, E(g(X) \mid X_n = x_n).$$

Hence,

$$E(f(X) g(X) \mid X_n) \ge E(f(X) \mid X_n)\, E(g(X) \mid X_n).$$

On taking expectation on both sides of this inequality, we have

$$E(f(X) g(X)) \ge E[E(f(X) \mid X_n)\, E(g(X) \mid X_n)].$$

Observe that $E(f(X) \mid X_n)$ and $E(g(X) \mid X_n)$ are increasing functions of $X_n$ so that by the
result of $n = 1$, we have

$$E[E(f(X) \mid X_n)\, E(g(X) \mid X_n)] \ge E[E(f(X) \mid X_n)]\, E[E(g(X) \mid X_n)] = E(f(X)) E(g(X)).$$

This completes the proof for the case of $n$. □

Corollary 8.1 If $h(x_1, \ldots, x_n)$ is a monotone function of each of its arguments, then
for a set $U_1, \ldots, U_n$ of independent random numbers,

$$\mathrm{Cov}[h(U_1, \ldots, U_n), h(1 - U_1, \ldots, 1 - U_n)] \le 0.$$

Proof. Without loss of generality, by redefining $h$, we may assume that $h$ is increasing
in its first $r$ arguments and decreasing in its remaining $n - r$ arguments. Let

$$f(x_1, \ldots, x_n) = h(x_1, \ldots, x_r, 1 - x_{r+1}, \ldots, 1 - x_n),$$

$$g(x_1, \ldots, x_n) = -h(1 - x_1, \ldots, 1 - x_r, x_{r+1}, \ldots, x_n).$$

It follows that both $f$ and $g$ are increasing functions. By the preceding theorem,

$$\mathrm{Cov}[f(U_1, \ldots, U_n), g(U_1, \ldots, U_n)] \ge 0.$$

That is,

$$\mathrm{Cov}[h(U_1, \ldots, U_r, V_{r+1}, \ldots, V_n), h(V_1, \ldots, V_r, U_{r+1}, \ldots, U_n)] \le 0, \qquad (8.1)$$

where $V_i = 1 - U_i$. Observe that as $(h(U_1, \ldots, U_n), h(V_1, \ldots, V_n))$ has the same joint
distribution as $(h(U_1, \ldots, U_r, V_{r+1}, \ldots, V_n), h(V_1, \ldots, V_r, U_{r+1}, \ldots, U_n))$, it follows
from Equation 8.1 that

$$\mathrm{Cov}[h(U_1, \ldots, U_n), h(V_1, \ldots, V_n)] \le 0,$$

proving the corollary. □

When are antithetic variables effective? The following are some guidelines:

• Antithetic variables will result in a lower variance estimate than independent
simulations only if the values computed from a path and its antithetic counterpart
are negatively correlated.

• If $H$ is monotone in each of its arguments, then antithetic variables reduce
variance in estimating $E(H(Z_1, \ldots, Z_n))$.

• If $H$ is linear, then an antithetic estimate of $E(H(Z_1, \ldots, Z_n))$ has zero variance.

• If $H$ is symmetric, that is, $H(-Z) = H(Z)$, then an antithetic estimate of sample
size $2n$ has the same variance as an independent sample of size $n$.

Example 8.2 To illustrate some of these points, consider the simulations of payoff of options using antithetic variables. The function H in this case maps

$$z \mapsto \max\{0,\; S_0 \exp([r - \sigma^2/2]T + \sigma\sqrt{T}\,z) - K\}.$$

In Figure 8.1, the vertical axis is the payoff and the horizontal axis is the value of z, the input standard normal. All cases have r = 0.05, K = 50, and T = 0.5. The top three cases have $\sigma = 0.3$ and $S_0 = 40$, 50, and 60; the bottom three cases have $S_0 = 50$ and $\sigma = 0.10, 0.20, 0.30$. The top three graphs correspond to the function H for options that are out-of-the-money ($S_0 = 40$), at-the-money ($S_0 = 50$), and in-the-money ($S_0 = 60$), respectively; the bottom three graphs correspond to low, intermediate, and high volatility for an at-the-money option. (The precise parameter values are given in the caption of the figure.) As one would expect, increasing moneyness and decreasing volatility both increase the degree of linearity. For the values indicated in the figure, we find numerically that antithetics reduce variance by 14%, 42%, and 80% in the top three cases and by 65%, 49%, and 42% in the bottom three, respectively. Clearly, the more linear the function H is, the more effective the antithetic variable technique will be.
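The comparison in Example 8.2 can be tried out with a short Python sketch (the book's own code is in VBA and is not reproduced here). The function names, the seed, the sample size, and the choice to measure the reduction as one minus the ratio of the variance of an antithetic pair average to the variance of the average of two independent draws are all assumptions made for illustration; the figures printed will vary with the seed and need not match the percentages quoted above exactly.

```python
import math
import random
import statistics


def call_payoff(z, s0, k=50.0, r=0.05, sigma=0.3, t=0.5):
    """Undiscounted call payoff H(z) for a standard normal input z."""
    s_t = s0 * math.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)
    return max(0.0, s_t - k)


def antithetic_variance_reduction(s0, sigma=0.3, n_pairs=100_000, seed=1):
    """Estimate 1 - Var(pair average) / Var(average of two independent draws)."""
    rng = random.Random(seed)
    pair_means, single_draws = [], []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        h_plus = call_payoff(z, s0, sigma=sigma)
        h_minus = call_payoff(-z, s0, sigma=sigma)   # antithetic counterpart
        pair_means.append(0.5 * (h_plus + h_minus))
        single_draws.append(h_plus)
    var_pair = statistics.variance(pair_means)
    var_independent = statistics.variance(single_draws) / 2.0
    return 1.0 - var_pair / var_independent


if __name__ == "__main__":
    for s0 in (40.0, 50.0, 60.0):
        print(s0, round(antithetic_variance_reduction(s0), 3))
```

Under this definition the reduction equals minus the correlation between H(Z) and H(-Z), which is why it grows as the payoff becomes closer to linear in z.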

Example 8.3 Figure 8.2 plots the payoff $|S_T - K|$ on a straddle as a function of Z. The parameter values are given in the caption. The graph shows a high degree of symmetry around zero, suggesting that antithetic variables may not be as effective as in the other cases. Numerical results here indicate that an antithetic estimate based on m pairs of antithetic variables has higher variance than an estimate based on 2m independent samples.

Please see the online material for the VBA codes. 


8.3 STRATIFIED SAMPLING 

The idea of stratification is often used in sample surveys (Barnett, 1991). The idea lies in the observation that the population may be heterogeneous and consist of various homogeneous subgroups (defined, for example, by gender, race, or socioeconomic status). If we wish to learn about the whole population (such as whether people in Hong Kong would like to have universal suffrage in 2007), we can take a random sample from the whole population to estimate that quantity. On the other hand, it would be more efficient to take small samples from each subgroup and combine the estimates in each subgroup according to the fraction of the population that subgroup represents. As we can learn about the opinion of a homogeneous subgroup with a relatively small sample size, this stratified sampling procedure would be more efficient.

Figure 8.1 Illustration of payoffs for antithetic comparisons. (Six panels, each plotting the payoff H(z) against z.)

Figure 8.2 Payoff on a straddle as a function of input normal Z based on the parameters $S_0 = K = 50$, $\sigma = 0.30$, $T = 1$, and $r = 0.05$.

In general, if we want to estimate EX, where X depends on a random variable S that takes on one of the values in $\{1, \ldots, k\}$ with known probabilities, then the technique of stratification divides the simulation runs into k groups, with the ith group having S = i. Let $\bar X_i$ be the average value of X in those runs having S = i, and then estimate $EX = \sum_{i=1}^{k} E(X \mid S = i)\,P(S = i)$ by

$$\sum_{i=1}^{k} \bar X_i\, P(S = i).$$

This is known as stratified sampling.

To illustrate this idea, suppose that we want to estimate $E(g(U)) = \int_0^1 g(x)\,dx$. Consider two estimators on the basis of a sample of 2n runs. The first one is the standard method,

$$\hat g = \frac{1}{2n}\sum_{i=1}^{2n} g(U_i).$$

Note that $E(\hat g) = E(g(U))$ and

$$\operatorname{Var}(\hat g) = \frac{1}{4n^2}\sum_{i=1}^{2n} \operatorname{Var}(g(U_i)) = \frac{1}{2n}\left\{\int_0^1 g^2(x)\,dx - \left(\int_0^1 g(x)\,dx\right)^2\right\}.$$

On the other hand, we can write

$$E(g(U)) = \int_0^{1/2} g(x)\,dx + \int_{1/2}^{1} g(x)\,dx.$$

Instead of selecting Us from [0, 1], we can select the first n Us from [0, 1/2] and the remaining n Us from [1/2, 1] to construct a new estimator

$$\hat g_s = \frac{1}{2n}\left[\sum_{i=1}^{n} g(U_i/2) + \sum_{i=n+1}^{2n} g((U_i + 1)/2)\right].$$

It can be easily seen that if $U \sim U(0, 1)$, then $V = a + (b - a)U$ is distributed as uniform (a, b). In particular, $U/2 \sim U(0, 1/2)$ and $(U + 1)/2 \sim U(1/2, 1)$. To compute the variance of the new estimator, consider


$$\operatorname{Var}(\hat g_s) = \frac{1}{4n^2}\left\{\sum_{i=1}^{n} \operatorname{Var}(g(U_i/2)) + \sum_{i=n+1}^{2n} \operatorname{Var}(g((U_i + 1)/2))\right\}.$$

Direct computations show that if $U_i \sim U(0, 1)$, then

$$\operatorname{Var}\left(g\left(\frac{U_i}{2}\right)\right) = 2\int_0^{1/2} g^2(x)\,dx - 4m_1^2,$$
$$\operatorname{Var}\left(g\left(\frac{U_i + 1}{2}\right)\right) = 2\int_{1/2}^{1} g^2(x)\,dx - 4m_2^2,$$

where $m_1 = \int_0^{1/2} g(x)\,dx$ and $m_2 = \int_{1/2}^{1} g(x)\,dx$. Now

$$\operatorname{Var}\left(g\left(\frac{U_i}{2}\right)\right) + \operatorname{Var}\left(g\left(\frac{U_i + 1}{2}\right)\right) = 2\int_0^1 g^2(x)\,dx - 4(m_1^2 + m_2^2).$$

Consequently,

$$\operatorname{Var}(\hat g_s) = \frac{1}{2n}\left\{\int_0^1 g^2(x)\,dx - 2(m_1^2 + m_2^2)\right\}.$$

Note that

$$(m_1 + m_2)^2 + (m_1 - m_2)^2 = 2(m_1^2 + m_2^2).$$

Therefore,

$$\operatorname{Var}(\hat g_s) = \frac{1}{2n}\left\{\int_0^1 g^2(x)\,dx - (m_1 + m_2)^2 - (m_1 - m_2)^2\right\} = \operatorname{Var}(\hat g) - \frac{1}{2n}(m_1 - m_2)^2.$$

Because this second term is always non-negative, stratification reduces the variance by an amount of this second term. The bigger the difference in $m_1$ and $m_2$, the greater the reduction in variance. In general, if more strata are introduced, more reduction will be achieved. One can generalize this result to the multistrata case, but we omit the mathematical details here.

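Example 8.4 Consider again $\theta = E(e^U) = \int_0^1 e^x\,dx$.

Recall that by standard Monte Carlo with a sample of size 2,

$$\hat g = \tfrac{1}{2}\left(e^{U_1} + e^{U_2}\right),$$

and $\operatorname{Var}(\hat g) = 0.121$. On the other hand, using stratification, we have

$$\hat g_s = \tfrac{1}{2}\left(e^{U_1/2} + e^{(U_2 + 1)/2}\right),$$

and $\operatorname{Var}(\hat g_s) = \operatorname{Var}(\hat g) - (m_1 - m_2)^2/2$, where $m_1 = \int_0^{1/2} e^x\,dx = e^{1/2} - 1$ and $m_2 = \int_{1/2}^{1} e^x\,dx = e - e^{1/2}$. Thus,

$$\operatorname{Var}(\hat g_s) = 0.121 - (2e^{1/2} - e - 1)^2/2 = 0.0325,$$

resulting in a variance reduction of 73.13%.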
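The two-strata calculation of Example 8.4 can be checked numerically with a short Python sketch. The function names, the seed, and the number of replications are illustrative choices, not part of the original text; the closed-form variances use only the formulas derived above.

```python
import math
import random
import statistics


def standard_estimate(rng):
    """Average of e^U over two independent uniforms on (0, 1)."""
    return 0.5 * (math.exp(rng.random()) + math.exp(rng.random()))


def stratified_estimate(rng):
    """One draw from (0, 1/2) and one from (1/2, 1), as in Example 8.4."""
    u1, u2 = rng.random(), rng.random()
    return 0.5 * (math.exp(u1 / 2.0) + math.exp((u2 + 1.0) / 2.0))


def theoretical_variances():
    """Closed-form Var(g) and Var(g_s) for g(x) = e^x with two runs."""
    var_g = 0.5 * ((math.e**2 - 1.0) / 2.0 - (math.e - 1.0) ** 2)
    m1 = math.exp(0.5) - 1.0          # integral of e^x over (0, 1/2)
    m2 = math.e - math.exp(0.5)       # integral of e^x over (1/2, 1)
    return var_g, var_g - 0.5 * (m1 - m2) ** 2


if __name__ == "__main__":
    rng = random.Random(2024)
    reps = 100_000
    standard = [standard_estimate(rng) for _ in range(reps)]
    stratified = [stratified_estimate(rng) for _ in range(reps)]
    print("simulated  :", statistics.variance(standard), statistics.variance(stratified))
    print("theoretical:", theoretical_variances())
```

The simulated variances should settle close to the two theoretical values as the number of replications grows.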

Stratified sampling is also very useful to draw random samples from designated ranges. For example, if we want to sample $Z_1, \ldots, Z_{100}$ from a standard normal distribution, the standard technique would partition the whole real line $(-\infty, \infty)$ into a number of bins and sample Zs from these bins randomly. In such a case, it is inevitable that some bins may have more samples, while other bins, particularly those near the tails, may have no sample at all. Therefore, a random sample drawn this way would under-represent the tails. Although this may not be a serious issue in general, it may have a severe effect when the tail is the quantity of interest, such as the case in the simulation of VaR. To ensure that the bins are regularly represented, we may generate the Zs as follows. Let

$$V_i = \frac{1}{100}\left(U_i + (i - 1)\right), \quad i = 1, \ldots, 100,$$

where $U_i \sim U(0, 1)$ i.i.d. By the property of uniform distribution, $V_i \sim U\left(\frac{i-1}{100}, \frac{i}{100}\right)$.

Now let $Z_i = \Phi^{-1}(V_i)$. Then $Z_i$ falls between the (i - 1)th and ith percentiles of the standard normal distribution. For example, if i = 1, then $V = U/100 \sim U(0, 1/100)$ so that $Z = \Phi^{-1}(V)$ falls between $\Phi^{-1}(0) = -\infty$ and $\Phi^{-1}(0.01)$, that is, the 0th and the 1st percentile of a standard normal distribution.
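The percentile construction described above can be sketched in Python as follows. This is only an illustration: the function name and the guard against probabilities of exactly 0 or 1 are assumptions, and `statistics.NormalDist().inv_cdf` from the Python standard library plays the role of $\Phi^{-1}$.

```python
import random
from statistics import NormalDist


def stratified_normals(n_strata=100, seed=None):
    """Return one standard normal draw from each of n_strata equiprobable bins."""
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf
    draws = []
    for i in range(1, n_strata + 1):
        while True:
            v = (rng.random() + (i - 1)) / n_strata   # V_i ~ U((i-1)/n, i/n)
            if 0.0 < v < 1.0:   # inv_cdf needs a probability strictly inside (0, 1)
                break
        draws.append(inv_cdf(v))
    return draws


if __name__ == "__main__":
    z = stratified_normals(100, seed=7)
    print(min(z), max(z))
```

Because the ith draw always lies in the ith percentile band, the returned list is automatically sorted and both tails are represented in every sample.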

This method gives equal weight to each of the 100 equiprobable strata. Of course, the number 100 can be replaced by any number that is desirable. The price we pay in stratification is the loss of independence of the Zs. This complicates statistical inference for simulation results.

Example 8.5 As an illustration of stratification, consider simulating standard normal random numbers via the standard method and the stratification method, respectively. As can be clearly seen from Figures 8.3 and 8.4, stratified sampling generates samples much more uniformly over the range than standard Monte Carlo. Please see the online material for the VBA codes.

Figure 8.3 Simulations of 500 standard normal random numbers by standard Monte Carlo.

Figure 8.4 Simulations of 500 standard normal random numbers by stratified sampling.

Example 8.6 As a second illustration of stratification, consider the simulation of a European call option of Example 7.2 again.

In Example 7.2, we simulate the terminal prices $S_1(T), \ldots, S_n(T)$ according to Equation 5.1 and then compute the estimate as

$$\hat C = \frac{e^{-rT}}{n}\sum_{i=1}^{n} \max\{S_i(T) - K, 0\}.$$

In this standard simulation, the random normals are sampled arbitrarily over the whole real line. We can improve the efficiency by introducing stratification.

1. Partition $(-\infty, \infty)$ into B strata or bins.

2. Set $V_i = \frac{1}{B}(U_i + (i - 1))$, $i = 1, \ldots, B$, and generate the desired number of random samples ($N_B$, say) of Vs in the ith bin.

3. Apply $\Phi^{-1}(V_i)$ to get the desired normal random numbers from each bin and calculate $\hat C_i$ from each bin.

4. Average the $\hat C_i$ over the total number of bins to get an overall estimate $\hat C$.

5. Calculate the standard error as in the previous cases.

This numerical example uses $S_0 = 10$, $K = 12$, $r = 0.03$, $\sigma = 0.40$, and $T = 1$. The theoretical Black-Scholes price is 1.0139. We simulate the European option price for different bin sizes with $N_B \times B = 1{,}000$ in all cases. The effect of stratification increases as we increase the number of bins. The results are shown in Table 8.1. Please see the online material for the VBA codes.
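The five steps of Example 8.6 can be sketched in Python as below. This is not the book's VBA code: the function names and the seed are illustrative, and the standard error formula (the square root of the sum of within-bin sample variances divided by $N_B$, all divided by B) is an assumption about how step 5 is carried out. With one draw per bin no within-bin variance can be estimated, so the sketch returns `None` for the standard error in that case.

```python
import math
import random
import statistics
from statistics import NormalDist


def discounted_payoff(z, s0, k, r, sigma, t):
    """Discounted call payoff for a standard normal input z."""
    s_t = s0 * math.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)
    return math.exp(-r * t) * max(s_t - k, 0.0)


def stratified_call(n_bins, n_per_bin, s0=10.0, k=12.0, r=0.03, sigma=0.40,
                    t=1.0, seed=0):
    """Price a European call with n_bins equiprobable strata of the normal input."""
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf
    bin_means, bin_vars = [], []
    for i in range(1, n_bins + 1):
        payoffs = []
        while len(payoffs) < n_per_bin:
            v = (rng.random() + (i - 1)) / n_bins      # uniform on the ith bin
            if 0.0 < v < 1.0:                          # keep inv_cdf well defined
                payoffs.append(discounted_payoff(inv_cdf(v), s0, k, r, sigma, t))
        bin_means.append(statistics.fmean(payoffs))
        if n_per_bin > 1:
            bin_vars.append(statistics.variance(payoffs))
    price = statistics.fmean(bin_means)                # equal weight 1/B per bin
    if n_per_bin == 1:
        return price, None
    std_err = math.sqrt(sum(bin_vars) / n_per_bin) / n_bins
    return price, std_err


if __name__ == "__main__":
    for b in (1, 10, 100):
        print(b, stratified_call(b, 1000 // b))
```

With a single bin the routine reduces to standard Monte Carlo, which gives a baseline against which the stratified standard errors can be compared.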

TABLE 8.1 Effects of Stratification for Simulated Option Prices with Different Bin Sizes

Bins (B)    N_B     Mean (C)    Std. Err.
1           1000    0.9744      0.0758
2           500     1.0503      0.0736
5           200     1.0375      0.0505
10          100     0.9960      0.0389
20          50      0.9874      0.0229
50          20      1.0168      0.0146
100         10      0.9957      0.0092
200         5       1.0208      0.0094
500         2       1.0151      0.0062
1000        1       1.0091      NA

Regular stratification puts equal weight on each of the B bins. Such an allocation may not be ideal as one would like to have sample sizes directly related to the variability of the target function over that bin. To illustrate this point, consider the payoff of a European call option again.

Example 8.7 Stratified sampling for a European call with the same parameter values as in Example 8.6.

We know that if $S_T < K$, then the payoff of the call is zero. Recall that

$$S_T = S_0 e^{[r - \sigma^2/2]T + \sigma\sqrt{T}Z}.$$

Therefore, $S_T < K$ if $S_0 e^{[r - \sigma^2/2]T + \sigma\sqrt{T}Z} < K$. That is,

$$Z < [\log(K/S_0) - (r - \sigma^2/2)T]/(\sigma\sqrt{T}) := L.$$

Every simulated Z < L is being wasted as it just returns the value 0. We should only be concentrating on the interval $[L, \infty)$. How can we achieve this goal?

1. Find out the c.d.f. of a normal distribution Y restricted on $[L, \infty)$. It can be shown that Y has c.d.f.

$$F(y) = \frac{\Phi(y) - \Phi(L)}{1 - \Phi(L)}, \quad y \ge L.$$

2. Use the inverse transform method to generate Y. Consider the inverse transformation of F, that is, solve for y such that $y = F^{-1}(x)$. Writing it out, we have $x = F(y) = \frac{\Phi(y) - \Phi(L)}{1 - \Phi(L)}$ so that

$$y = \Phi^{-1}(x(1 - \Phi(L)) + \Phi(L)).$$

Now generate U from uniform (0, 1) and evaluate

$$Y = \Phi^{-1}(U(1 - \Phi(L)) + \Phi(L)).$$

3. Plug in the generated Y into the simulation step of the payoff of the call and complete the analysis. Note that when evaluating the new estimator for the payoff, we need to multiply by the factor $1 - \Phi(L)$. That is,

$$C^* = (1 - \Phi(L))\hat C,$$

where $\hat C$ is the average of the simulated payoffs using the truncated normal random variables.

In general, we would like to apply the stratification technique to bins in which the variability of the integrand is largest. In this case, we just focus the entire sample on the case $S_T > K$.

The results are given in Table 8.2. Please see the online material for the VBA codes.

TABLE 8.2 Effects of Stratification for Simulated Option Prices with Restricted Normal

Bins (B)    N_B     Mean (C)    Std. Err.   Adj. Mean   SE
1           1000    0.9744      0.0758      0.9842      0.1102
2           500     1.0503      0.0736      1.0303      0.0823
5           200     1.0375      0.0505      1.0235      0.0524
10          100     0.9960      0.0389      1.0101      0.0404
20          50      0.9874      0.0229      1.0058      0.0238
50          20      1.0168      0.0146      1.0147      0.0153
100         10      0.9957      0.0092      1.0089      0.0095
200         5       1.0208      0.0094      1.0160      0.0099
500         2       1.0151      0.0062      1.0143      0.0066
1000        1       1.0091      NA          1.0125      NA
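The three steps of Example 8.7 (restricting the normal input to $[L, \infty)$ and rescaling by $1 - \Phi(L)$) can be sketched in Python as follows. The sketch uses a single stratum on the restricted range rather than the binned version reported in Table 8.2; the function name, seed, and standard error formula are illustrative assumptions, and the figures in the table are not expected to be reproduced exactly.

```python
import math
import random
import statistics
from statistics import NormalDist


def truncated_normal_call(n, s0=10.0, k=12.0, r=0.03, sigma=0.40, t=1.0, seed=0):
    """Call price using normals restricted to [L, infinity), rescaled by 1 - Phi(L)."""
    rng = random.Random(seed)
    phi = NormalDist()
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    lower = (math.log(k / s0) - drift) / vol          # the threshold L
    p_below = phi.cdf(lower)                          # Phi(L)
    payoffs = []
    while len(payoffs) < n:
        x = rng.random() * (1.0 - p_below) + p_below  # U(1 - Phi(L)) + Phi(L)
        if not p_below < x < 1.0:                     # keep inv_cdf well defined
            continue
        y = phi.inv_cdf(x)                            # truncated normal draw
        s_t = s0 * math.exp(drift + vol * y)
        payoffs.append(math.exp(-r * t) * max(s_t - k, 0.0))
    weight = 1.0 - p_below
    price = weight * statistics.fmean(payoffs)        # C* = (1 - Phi(L)) * C-hat
    std_err = weight * statistics.stdev(payoffs) / math.sqrt(n)
    return price, std_err


if __name__ == "__main__":
    print(truncated_normal_call(10_000))
```

Every draw now lands in the region where the payoff is positive, so no simulated normal is spent on paths that return zero.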


8.4 CONTROL VARIATES 


The idea of control variates is very simple. Suppose that we want to estimate $\theta = EX$ from the simulated data. Suppose that for some other variable Y, the mean $\mu_Y = EY$ is known. Then for any given constant c, the quantity

$$X_{CV} = X + c(Y - \mu_Y)$$

is also an unbiased estimate of $\theta$, as $E(X_{CV}) = \theta$. Presumably, if we choose the constant c cleverly, some form of variance reduction can be achieved. How can we do this? In other words, what would be a good choice of c? To answer this question, first consider the variance of the new estimator $X_{CV}$, call it $\sigma^2_{CV}$.

$$\sigma^2_{CV} = \operatorname{Var}(X + c(Y - \mu_Y)) = \operatorname{Var}X + c^2\operatorname{Var}Y + 2c\operatorname{Cov}(X, Y).$$

We would like to find c such that $\sigma^2_{CV}$ is minimized. Differentiating the preceding expression with respect to c and setting it equal to zero, we have

$$2c\operatorname{Var}Y + 2\operatorname{Cov}(X, Y) = 0.$$

Solving for such a c, we get $c^* = -\operatorname{Cov}(X, Y)/\operatorname{Var}Y$ as the value of c that minimizes $\sigma^2_{CV}$. For such a $c^*$,

$$\sigma^2_{CV^*} = \operatorname{Var}X - \frac{\operatorname{Cov}^2(X, Y)}{\operatorname{Var}Y}.$$

The variable Y used in this way is known as a control variate for the simulation estimator X. Recall that $\operatorname{Corr}(X, Y) = \operatorname{Cov}(X, Y)/(\operatorname{Var}X\operatorname{Var}Y)^{1/2}$. Therefore,

$$\sigma^2_{CV^*} = \operatorname{Var}X\,(1 - \operatorname{Corr}^2(X, Y)).$$




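Hence, as long as $\operatorname{Corr}(X, Y) \ne 0$, some form of variance reduction is achieved. In practice, quantities such as $\sigma_Y^2 = \operatorname{Var}Y$ and $\operatorname{Cov}(X, Y)$ are usually not available; they have to be estimated from the simulations on the basis of sample values. For example, let $\bar X = \sum_{i=1}^{n} X_i/n$ and $\bar Y = \sum_{i=1}^{n} Y_i/n$. Then

$$\widehat{\operatorname{Cov}}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)(Y_i - \bar Y),$$
$$\widehat{\operatorname{Var}}(Y) = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar Y)^2,$$
$$\hat c^* = -\frac{\widehat{\operatorname{Cov}}(X, Y)}{\widehat{\operatorname{Var}}(Y)}.$$

Suppose that we use $\bar X$ from simulation to estimate $\theta$. Then the control variate would be $\bar Y$, and the control variate estimator is

$$\bar X + \hat c^*(\bar Y - \mu_Y),$$

with variance equal to

$$\frac{1}{n}\left(\operatorname{Var}X - \frac{\operatorname{Cov}^2(X, Y)}{\operatorname{Var}Y}\right).$$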


Equivalently, one can use the simple linear regression equation

$$X = a + bY + e, \quad e \sim \text{i.i.d. } (0, \sigma^2), \qquad (8.2)$$

to estimate $c^*$. In fact, it can be easily shown that the least squares estimate of b satisfies $\hat b = -\hat c^*$; see Weisberg (1985). In such a case, the control variate estimator is given by

$$\bar X + \hat c^*(\bar Y - \mu_Y) = \bar X - \hat b(\bar Y - \mu_Y) = \hat a + \hat b\mu_Y, \qquad (8.3)$$

where $\hat a = \bar X - \hat b\bar Y$ is the least squares estimate of a in Equation 8.2. That is, the control variate estimate is equal to the estimated regression equation evaluated at the point $\mu_Y$.

Notice that there is a very simple geometric interpretation using Equation 8.2. Firstly, observe that the estimated regression line is

$$\hat X = \hat a + \hat b Y = \bar X + \hat b(Y - \bar Y).$$

Thus, this line passes through the point $(\bar Y, \bar X)$. Secondly, from Equation 8.3, the control variate estimator differs from $\bar X$ by

$$(\hat a + \hat b\mu_Y) - \bar X = -\hat b(\bar Y - \mu_Y).$$

Suppose that $\bar Y < \mu_Y$, that is, the simulation run underestimates $\mu_Y$, and suppose that X and Y are positively correlated. Then it is likely that $\bar X$ would underestimate $E(X) = \theta$. We therefore need to adjust the estimator upward, and this is indicated by the fact that $\hat b = -\hat c^* > 0$. The extra amount that needs to be adjusted upward equals $-\hat b(\bar Y - \mu_Y)$, which is governed by the linear Equation 8.3.

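Finally, $\hat\sigma^2$, the regression estimate of $\sigma^2$, is the estimate of $\operatorname{Var}(X - \hat b Y) = \operatorname{Var}(X + \hat c^* Y)$. To see this, recall from regression that

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\hat e_i^2$$
$$= \frac{1}{n}\sum_{i=1}^{n}(X_i - \hat a - \hat b Y_i)^2$$
$$= \frac{1}{n}\sum_{i=1}^{n}(X_i - (\bar X - \hat b\bar Y) - \hat b Y_i)^2$$
$$= \frac{1}{n}\sum_{i=1}^{n}\left((X_i - \bar X) - \hat b(Y_i - \bar Y)\right)^2$$
$$= \frac{1}{n}\sum_{i=1}^{n}\left((X_i - \bar X)^2 - \hat b^2(Y_i - \bar Y)^2\right)$$
$$= \widehat{\operatorname{Var}}(X) - \hat b^2\,\widehat{\operatorname{Var}}(Y)$$
$$= \widehat{\operatorname{Var}}(X - \hat b Y).$$

The last equality follows from a standard expansion of the variance estimate (see Exercise 8.2). It follows that the estimated variance of the control variate estimator $\bar X + \hat c^*(\bar Y - \mu_Y)$ is $\hat\sigma^2/n$.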

Example 8.8 Consider the problem $\theta = E(e^U)$ again.

A natural control variate is U itself. Now

$$\operatorname{Cov}(e^U, U) = E(Ue^U) - E(U)E(e^U) = \int_0^1 xe^x\,dx - (e - 1)/2 = 1 - (e - 1)/2 = 0.14086.$$

The second last equality makes use of the facts from the previous examples that $E(U) = 1/2$, $\operatorname{Var}U = 1/12$, and $\operatorname{Var}(e^U) = 0.242$. It follows that the control variate estimate has variance

$$\operatorname{Var}(e^U + c^*(U - 1/2)) = \operatorname{Var}(e^U)\left(1 - 12(0.14086)^2/0.242\right) = 0.0039,$$

resulting in a variance reduction of $(0.242 - 0.0039)/0.242 \times 100\% = 98.4\%$.

In general, if we want to have more than one control variate, we can make use of outputs from the multiple linear regression model given by

$$X = a + \sum_{i=1}^{k} b_i Y_i + e, \quad e \sim \text{i.i.d. } (0, \sigma^2).$$

In this case, the least squares estimates of a and the $b_i$s, $\hat a$ and the $\hat b_i$s, can be easily shown to satisfy $\hat c_i^* = -\hat b_i$, $i = 1, \ldots, k$. Furthermore, the control variate estimate is given by

$$\bar X + \sum_{i=1}^{k}\hat c_i^*(\bar Y_i - \mu_i) = \hat a + \sum_{i=1}^{k}\hat b_i\mu_i,$$

where $E(Y_i) = \mu_i$, $i = 1, \ldots, k$. In other words, the control variate estimate is equal to the estimated multiple regression line evaluated at the point $(\mu_1, \ldots, \mu_k)$. By the same token, the variance of the control variate estimate is given by $\hat\sigma^2/n$, where $\hat\sigma^2$ is the regression estimate of $\sigma^2$.
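Returning to the single-control case of Example 8.8, the following Python sketch estimates $c^*$ from simulated data and checks numerically that the control variate estimator coincides with the fitted regression line evaluated at $\mu_Y = 1/2$. The function name, sample size, and seed are illustrative assumptions; the sample covariance and variance use the usual divisor n - 1.

```python
import math
import random
import statistics


def control_variate_demo(n=20_000, seed=0):
    """Estimate E(e^U) with U as control; also return the regression form."""
    rng = random.Random(seed)
    y = [rng.random() for _ in range(n)]            # control variate Y = U
    x = [math.exp(u) for u in y]                    # target X = e^U
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_yy = statistics.variance(y, y_bar)
    c_star = -s_xy / s_yy                           # estimated optimal coefficient
    mu_y = 0.5                                      # known mean of the control
    cv_estimate = x_bar + c_star * (y_bar - mu_y)
    b_hat = -c_star                                 # least squares slope
    a_hat = x_bar - b_hat * y_bar                   # least squares intercept
    regression_estimate = a_hat + b_hat * mu_y      # fitted line at mu_Y
    adjusted = [xi + c_star * (yi - mu_y) for xi, yi in zip(x, y)]
    reduction = 1.0 - statistics.variance(adjusted) / statistics.variance(x)
    return cv_estimate, regression_estimate, c_star, reduction


if __name__ == "__main__":
    print(control_variate_demo())
```

The two estimates agree up to floating-point rounding, and the estimated variance reduction should be close to the 98.4% obtained analytically in Example 8.8.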

Example 8.9 Continuing along the same lines, consider simulating the vanilla European call option as in Example 8.6, using the terminal value $S_T$ as the control variate.

The control variate estimator is given by

$$C_{cv} = \hat C + c^*(S_T - E(S_T)).$$

Recalling that $S_T = S_0 e^{\nu T + \sigma\sqrt{T}Z}$ with $\nu = r - \sigma^2/2$, it can be easily deduced that

$$E(S_T) = S_0 e^{rT}, \qquad (8.4)$$
$$\operatorname{Var}(S_T) = S_0^2 e^{2rT}(e^{\sigma^2 T} - 1). \qquad (8.5)$$

The algorithm goes as follows:

1. For $i = 1, \ldots, N_1$, simulate a pilot of $N_1$ independent paths to get

$$S_T(i) = S_0 e^{\nu T + \sigma\sqrt{T}Z_i},$$
$$C(i) = e^{-rT}\max\{0, S_T(i) - K\}.$$

2. Compute $E(S_T)$ as $S_0 e^{rT}$ or estimate it by $\bar S_T = \sum_{i=1}^{N_1} S_T(i)/N_1$. Compute $\operatorname{Var}(S_T)$ as $S_0^2 e^{2rT}(e^{\sigma^2 T} - 1)$ or estimate it by $\sum_{i=1}^{N_1}(S_T(i) - \bar S_T)^2/(N_1 - 1)$. Now estimate the covariance by

$$\widehat{\operatorname{Cov}}(S_T, C) = \frac{1}{N_1 - 1}\sum_{i=1}^{N_1}(S_T(i) - \bar S_T)(C(i) - \bar C),$$

where $\bar C = \sum_{i=1}^{N_1} C(i)/N_1$ and $\bar S_T = \sum_{i=1}^{N_1} S_T(i)/N_1$.

3. Repeat the simulations of $S_T$ and C by means of control variate. For $i = 1, \ldots, N_2$, independently simulate

$$S_T(i) = S_0 e^{\nu T + \sigma\sqrt{T}Z_i},$$
$$C(i) = e^{-rT}\max\{0, S_T(i) - K\},$$
$$C_{cv}(i) = C(i) + c^*(S_T(i) - E(S_T)),$$

where $c^* = -\widehat{\operatorname{Cov}}(S_T, C)/\operatorname{Var}S_T$ is computed from the preceding step.

4. Calculate the control variate estimator by

$$\bar C_{cv} = \frac{1}{N_2}\sum_{i=1}^{N_2} C_{cv}(i).$$

Complete the simulation by evaluating the standard error of $\bar C_{cv}$ and construct confidence intervals.

Please see the online material for the VBA codes.

For $N_1 = 500$ and $N_2 = 50{,}000$, we have a 95% confidence interval for $C_{cv}$ of [1.0023, 1.0247]. In this case, the estimated call price is 1.0135 with standard error 0.0057.
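The pilot-then-main-run algorithm of Example 8.9 can be sketched in Python as follows. This is an illustrative rendering rather than the book's VBA code: the function names and seed are assumptions, the exact values from Equations 8.4 and 8.5 are used for the mean and variance of the control, and the plain Monte Carlo standard error from the same main run is returned for comparison.

```python
import math
import random
import statistics


def simulate(rng, n, s0, k, r, sigma, t):
    """Simulate terminal prices S_T(i) and discounted call payoffs C(i)."""
    nu = r - 0.5 * sigma**2
    prices, payoffs = [], []
    for _ in range(n):
        s = s0 * math.exp(nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0))
        prices.append(s)
        payoffs.append(math.exp(-r * t) * max(0.0, s - k))
    return prices, payoffs


def control_variate_call(n_pilot=500, n_main=50_000, s0=10.0, k=12.0, r=0.03,
                         sigma=0.40, t=1.0, seed=0):
    """Return (price, standard error, plain Monte Carlo standard error)."""
    rng = random.Random(seed)
    mean_s = s0 * math.exp(r * t)                                            # Eq. 8.4
    var_s = s0**2 * math.exp(2 * r * t) * (math.exp(sigma**2 * t) - 1.0)     # Eq. 8.5
    # Steps 1-2: pilot run to estimate Cov(S_T, C) and hence c*.
    s_p, c_p = simulate(rng, n_pilot, s0, k, r, sigma, t)
    s_bar, c_bar = statistics.fmean(s_p), statistics.fmean(c_p)
    cov = sum((s - s_bar) * (c - c_bar) for s, c in zip(s_p, c_p)) / (n_pilot - 1)
    c_star = -cov / var_s
    # Steps 3-4: independent main run with the control variate adjustment.
    s_m, c_m = simulate(rng, n_main, s0, k, r, sigma, t)
    c_cv = [c + c_star * (s - mean_s) for s, c in zip(s_m, c_m)]
    price = statistics.fmean(c_cv)
    std_err = statistics.stdev(c_cv) / math.sqrt(n_main)
    plain_std_err = statistics.stdev(c_m) / math.sqrt(n_main)
    return price, std_err, plain_std_err


if __name__ == "__main__":
    price, se, plain_se = control_variate_call()
    print(price, se, (price - 1.96 * se, price + 1.96 * se), plain_se)
```

Estimating $c^*$ from a separate pilot run keeps the main-run estimator unbiased, at the cost of a coefficient that is only approximately optimal.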


In using control variates, there are a number of features that should be considered.

• What should constitute the appropriate control? We have seen that in simple cases, the underlying asset prices may be appropriate. In more complicated situations, we may use some easily computed quantities that are highly correlated with the object of interest as control variates. For example, standard calls and puts frequently provide a convenient source of control variates for pricing exotic options, and so does the underlying asset itself.

• The control variate estimator is usually unbiased by construction. In addition, we can separate the estimation of the coefficients ($c_i^*$) from the estimation of prices.

• The flexibility of choosing the $c_i$s suggests that we can sometimes make optimal use of information. In any event, we should exploit the specific features of the problem under consideration, rather than rely on generic applications of routine methods.

• Because of their close relationship with linear regression, control variates are easily computed and explained.

• We have only covered linear control. In practice, one can consider using nonlinear control variates, for example, $\bar X\bar Y/\mu_Y$. Statistical inference for nonlinear control may be tricky though.

8.5 IMPORTANCE SAMPLING

After studying three variance reduction methods, we pursue one last method, namely, importance sampling. This method is similar in idea to the acceptance-rejection method that was discussed in Chapter 6. Its main idea lies in concentrating the sampling at places where the quantity of interest carries the most information, hence the name importance sampling. This chapter then concludes with examples illustrating the different methods of variance reduction in risk management.

Suppose that we are interested in estimating

$$\theta = E[h(X)] = \int h(x)f(x)\,dx,$$

where $X = (X_1, \ldots, X_n)$ denotes an n-dimensional random vector having a joint p.d.f. $f(x) = f(x_1, \ldots, x_n)$. Suppose that a direct simulation of the random vector X is inefficient so that computing h(X) is infeasible. This inefficiency may be due to difficulties encountered in simulating X, or the variance of h(X) being too large, or a combination of both.

Suppose that there exists another density g(x), which is easy to simulate and satisfies the condition that f(x) = 0 whenever g(x) = 0. Then $\theta$ can be written as

$$\theta = E[h(X)] = \int \frac{h(x)f(x)}{g(x)}\,g(x)\,dx = E_g\left[\frac{h(X)f(X)}{g(X)}\right],$$

where the notation $E_g$ denotes the expectation of the random vector X taken under the density g, that is, X has joint p.d.f. g(x). It follows from this identity that $\theta$ can be estimated by generating X with density g and then using as the estimator the average of the values of $h(X)f(X)/g(X)$. In other words, we could construct a Monte Carlo estimator of $\theta = E(h(X))$ by first generating i.i.d. random vectors $X_i$ with p.d.f. g(x), then using the estimator

$$\hat\theta = \frac{1}{n}\sum_{i=1}^{n}\frac{h(X_i)f(X_i)}{g(X_i)}.$$

If a density g(x) can be chosen so that the random variable $h(X)f(X)/g(X)$ has a small variance, then this approach is known as the importance sampling approach and can result in an efficient estimator of $\theta$.

To see how it works, note that the ratio $f(X)/g(X)$ represents the likelihood ratio of obtaining X with respective densities f and g. If X is distributed according to g, then f(X) would be small relative to g(X), and therefore when X is simulated according to g, the likelihood ratio $f(X)/g(X)$ will usually be small in comparison to 1. On the other hand, it can be seen that

$$E_g\left[\frac{f(X)}{g(X)}\right] = \int \frac{f(x)}{g(x)}\,g(x)\,dx = \int f(x)\,dx = 1.$$

Thus, although the likelihood ratio $f(X)/g(X)$ is usually smaller than 1, its mean is equal to 1, suggesting that it occasionally takes large values and results in a large variance.

To make the variance of $h(X)f(X)/g(X)$ small, we arrange for a density g such that those values of X for which $f(X)/g(X)$ is large are precisely the values for which h(X) is small, thus making the ratio $h(X)f(X)/g(X)$ stay small. Because importance sampling requires h to be small sometimes, it works best when estimating a small probability. Further discussions on importance sampling and the likelihood method are given in Glasserman (2003).

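Example 8.10 Consider the problem $\theta = E(U^5)$.

Suppose that we use the standard method $\hat\theta = \frac{1}{n}\sum_{i=1}^{n} U_i^5$, then we oversample the data near the origin and undersample the data near 1. It is easy to compute that

$$\operatorname{Var}(\hat\theta) = \frac{1}{n}\left\{EU^{10} - (EU^5)^2\right\} = \frac{1}{n}\left(\frac{1}{11} - \frac{1}{36}\right) = \frac{0.0631}{n}.$$

Now, suppose we use importance sampling, putting more weight near 1. Let $g(x) = 5x^4$ for 0 < x < 1. Then

$$\theta = E_g\left[X^5\cdot\frac{1}{5X^4}\right] = E_g\left[\frac{X}{5}\right], \qquad \hat\theta_I = \frac{1}{n}\sum_{i=1}^{n}\frac{X_i}{5}.$$

The variance of this method is

$$\operatorname{Var}(\hat\theta_I) = \frac{1}{n}\left\{E_g\left(\frac{X^2}{25}\right) - \left(E_g\frac{X}{5}\right)^2\right\} = \frac{1}{n}\left(\frac{1}{35} - \frac{1}{36}\right) = \frac{0.000794}{n},$$

resulting in a variance reduction of 98.74%.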
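Example 8.10 can be checked with a few lines of Python. The sketch assumes that draws from $g(x) = 5x^4$ are obtained by the inverse transform $X = U^{1/5}$ (since the c.d.f. of g is $x^5$ on (0, 1)); the function names, seed, and number of replications are illustrative.

```python
import random
import statistics


def standard_draw(rng):
    """One draw of h(U) = U^5 with U uniform on (0, 1)."""
    return rng.random() ** 5


def importance_draw(rng):
    """One draw of h(X) f(X) / g(X) = X / 5 with X from g(x) = 5 x^4."""
    x = rng.random() ** 0.2      # inverse transform: G(x) = x^5 on (0, 1)
    return x / 5.0


if __name__ == "__main__":
    rng = random.Random(42)
    reps = 100_000
    standard = [standard_draw(rng) for _ in range(reps)]
    weighted = [importance_draw(rng) for _ in range(reps)]
    print("means    :", statistics.fmean(standard), statistics.fmean(weighted))
    print("variances:", statistics.variance(standard), statistics.variance(weighted))
```

Both sample means estimate 1/6, while the per-draw variances should be near 1/11 - 1/36 and 1/35 - 1/36, respectively.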

How do we choose g in general? This requires the notion of the so-called tilted density. Recall that the notation $M(t) = E(e^{tX})$ represents the moment-generating function (m.g.f.) of the random variable X with density f.

Definition 8.2 A density function

$$f_t(x) = \frac{e^{tx}f(x)}{M(t)}$$

is called a tilted density of a given f, $-\infty < t < \infty$.

Note that from this definition, a random variable with density $f_t$ tends to be larger than the one with density f when t > 0, and tends to be smaller when t < 0.

Example 8.11 Let f be a Bernoulli density with parameter p. Then $f(x) = p^x(1 - p)^{1-x}$, x = 0, 1. In this case, the m.g.f. is $M(t) = E(e^{tX}) = pe^t + (1 - p)$ so that

$$f_t(x) = \frac{1}{M(t)}e^{tx}f(x) = \frac{1}{M(t)}(pe^t)^x(1 - p)^{1-x} = \left(\frac{pe^t}{pe^t + 1 - p}\right)^x\left(\frac{1 - p}{pe^t + 1 - p}\right)^{1-x}.$$

Thus, the tilted density $f_t$ is a Bernoulli density with parameter $p_t = pe^t/(pe^t + 1 - p)$.

In many instances, we are interested in sums of independent random variables. In these cases, the joint density f(x) of $x = (x_1, \ldots, x_n)$ can be written as the product of the marginals $f_i$ of $x_i$ so that

$$f(x) = f_1(x_1)\cdots f_n(x_n).$$

In this situation, it is often useful to generate the $X_i$ according to their tilted densities with a common t.

Example 8.12 Let $X_1, \ldots, X_n$ be independent with marginal densities $f_i$. Suppose that we are interested in estimating the quantity

$$\theta = P(S \ge a),$$

where $S = \sum_{i=1}^{n} X_i$ and $a > \sum_{i=1}^{n} E(X_i)$ is a given constant. We can apply tilted densities to estimate $\theta$. Let $I\{S \ge a\}$ equal 1 if $S \ge a$ and 0 otherwise. Then

$$\theta = E(I\{S \ge a\}),$$

where the expectation is taken with respect to the joint density. Suppose that we simulate $X_i$ according to the tilted density function $f_{i,t}$, where the value of t > 0 is to be specified. To construct the importance sampling estimator, note that $h(X) = I\{S \ge a\}$, $f(X) = \prod_i f_i(X_i)$, and $g(X) = \prod_i f_{i,t}(X_i)$. The importance sampling estimator would be

$$\hat\theta = I\{S \ge a\}\prod_i \frac{f_i(X_i)}{f_{i,t}(X_i)}.$$

Now $f_i(X_i)/f_{i,t}(X_i) = M_i(t)e^{-tX_i}$, therefore,

$$\hat\theta = I\{S \ge a\}\prod_i M_i(t)e^{-tX_i} = I\{S \ge a\}M(t)e^{-tS}, \quad \left(M(t) = \prod_i M_i(t)\right).$$

As it is assumed that t > 0, $S \ge a$ iff $e^{-tS} \le e^{-ta}$ and

$$I\{S \ge a\}e^{-tS} \le e^{-ta},$$

so that

$$\hat\theta \le M(t)e^{-ta}.$$

We now find t > 0 such that the right-hand side of the aforementioned inequality is minimized. In that case, we obtain an estimator that lies between 0 and $\min_t M(t)e^{-ta}$. It can be shown that such t can be found by solving the equation

$$E_t(S) = a.$$

After solving for t, it can be used in the simulation. To be specific, suppose $X_1, \ldots, X_n$ are i.i.d. Bernoulli trials with $p = p_i = 0.4$. Let n = 20 and a = 16. Then

$$\hat\theta = I\{S \ge a\}e^{-tS}(pe^t + 1 - p)^{20}.$$

Recall from the preceding example that the tilted density $f_{i,t}$ is the p.d.f. of a Bernoulli trial with parameter $p^* = pe^t/(pe^t + 1 - p)$. It follows that

$$E_t(S) = 20p^* = 20\,\frac{pe^t}{pe^t + 1 - p}.$$

Plugging in n = 20, p = 0.4, a = 16, we have

$$20\,\frac{0.4e^t}{0.4e^t + 0.6} = 16,$$

which leads to $e^{t^*} = 6$. Therefore, we should generate Bernoulli trials with parameter $0.4e^{t^*}/(0.4e^{t^*} + 0.6) = 0.8$ as the g and evaluate $M(t^*) = (0.4e^{t^*} + 0.6)^{20}$ and $e^{-t^*S} = (1/6)^S$. The importance sampling estimator is now

$$\hat\theta = I\{S \ge 16\}M(t^*)e^{-t^*S} = I\{S \ge 16\}\,3^{20}(1/6)^S.$$

Furthermore, we know that

$$\hat\theta \le M(t^*)e^{-t^*a} = 3^{20}(1/6)^{16} = 0.001236.$$

Thus, in each iteration, the value of the importance sampling estimator lies between 0 and 0.001236.

On the other hand, we can also evaluate $\theta = P(S \ge 16)$ exactly, which equals the probability that a Binomial random variable with parameters 20 and 0.4 be at least as big as 16. This value turns out to be 0.000317. Recall the function $h(X) = I\{S \ge 16\}$. This is a Bernoulli trial with parameter $\theta = 0.000317$. Therefore, if we simulate directly from the Xs, the standard estimator $\hat\theta_S$ has variance

$$\operatorname{Var}(\hat\theta_S) = \theta(1 - \theta) = 3.169 \times 10^{-4}.$$

As $0 \le \hat\theta \le 0.001236$, it can be shown that

$$\operatorname{Var}(\hat\theta) \le (0.001236)^2/4 = 3.819 \times 10^{-7},$$

which is much smaller than the variance of the standard estimator $\hat\theta_S$.
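The Bernoulli case of Example 8.12 can be reproduced with the following Python sketch. The constant and function names are illustrative assumptions; the tilt is obtained by solving $E_t(S) = a$ in closed form (so that $p^* = a/n$), and the exact tail probability is computed with `math.comb` for comparison.

```python
import math
import random
import statistics

N_TRIALS, P, A = 20, 0.4, 16   # n, p, and the threshold a of Example 8.12


def exact_tail():
    """P(S >= a) for S ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(N_TRIALS, k) * P**k * (1.0 - P) ** (N_TRIALS - k)
               for k in range(A, N_TRIALS + 1))


def tilted_parameters():
    """Solve E_t(S) = a: returns the tilted success probability and e^t."""
    p_star = A / N_TRIALS
    e_t = p_star * (1.0 - P) / (P * (1.0 - p_star))
    return p_star, e_t


def importance_draws(n_draws, seed=0):
    """Draw S under the tilted density and return I{S >= a} M(t) e^{-tS}."""
    rng = random.Random(seed)
    p_star, e_t = tilted_parameters()
    mgf = (P * e_t + 1.0 - P) ** N_TRIALS           # M(t*) = 3^20 here
    draws = []
    for _ in range(n_draws):
        s = sum(rng.random() < p_star for _ in range(N_TRIALS))
        draws.append(mgf * e_t ** (-s) if s >= A else 0.0)
    return draws


if __name__ == "__main__":
    draws = importance_draws(20_000)
    print("exact      :", exact_tail())
    print("estimate   :", statistics.fmean(draws))
    print("largest draw:", max(draws))
```

Every simulated value is bounded by $M(t^*)e^{-t^*a}$, which is what keeps the per-draw variance so far below that of the raw indicator.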

Another application of importance sampling is to estimate tail probabilities (recall that at the beginning we mentioned that importance sampling works best for small probabilities). Suppose that we are interested in estimating P(X > a), where X has p.d.f. f and a is a given constant. Let I(X > a) = 1 if X > a and 0 otherwise. Then

$$P(X > a) = E_f(I(X > a)) = E_g\left[I(X > a)\frac{f(X)}{g(X)}\right]$$
$$= E_g\left[I(X > a)\frac{f(X)}{g(X)}\,\Big|\,X > a\right]P_g(X > a) + E_g\left[I(X > a)\frac{f(X)}{g(X)}\,\Big|\,X \le a\right]P_g(X \le a)$$
$$= E_g\left[\frac{f(X)}{g(X)}\,\Big|\,X > a\right]P_g(X > a).$$

Take $g(x) = \lambda e^{-\lambda x}$, x > 0, an exponential density with parameter $\lambda$. Then the aforementioned derivation shows

$$P(X > a) = E_g\left[e^{\lambda X}f(X)\mid X > a\right]e^{-\lambda a}/\lambda.$$

Using the so-called "memoryless property," that is, $P(X > s + t \mid X > s) = P(X > t)$, of an exponential distribution, it can be easily seen that the conditional distribution of an exponential distribution conditioned on $\{X > a\}$ has the same distribution as a + X. Therefore,

$$P(X > a) = \frac{e^{-\lambda a}}{\lambda}E_g\left[e^{\lambda(X + a)}f(X + a)\right] = \frac{1}{\lambda}E_g\left[e^{\lambda X}f(X + a)\right].$$

We can now estimate $\theta$ by generating $X_1, \ldots, X_n$ according to an exponential distribution with parameter $\lambda$ and using

$$\hat\theta = \frac{1}{\lambda n}\sum_{i=1}^{n}e^{\lambda X_i}f(X_i + a).$$

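Example 8.13 Suppose that we are interested in $\theta = P(X > a)$, where X is standard normal. Then f is the normal density. Let g be an exponential density with $\lambda = a$. Then

$$P(X > a) = \frac{1}{a}E_g\left[e^{aX}f(X + a)\right] = \frac{1}{a\sqrt{2\pi}}E_g\left[e^{aX - (X + a)^2/2}\right] = \frac{e^{-a^2/2}}{a\sqrt{2\pi}}E_g\left[e^{-X^2/2}\right].$$

We can therefore estimate $\theta$ by generating X, an exponential distribution with rate a, and then using

$$\hat\theta = \frac{e^{-a^2/2}}{a\sqrt{2\pi}}\cdot\frac{1}{n}\sum_{i=1}^{n}e^{-X_i^2/2}$$

to estimate $\theta$. To compute the variance of $\hat\theta$, we need to compute the quantities $E_g[e^{-X^2/2}]$ and $E_g[e^{-X^2}]$. These can be computed numerically and can be shown to be

$$E_g[e^{-X^2/2}] = ae^{a^2/2}\sqrt{2\pi}\,(1 - \Phi(a)), \qquad E_g[e^{-X^2}] = ae^{a^2/4}\sqrt{\pi}\,(1 - \Phi(a/\sqrt{2})).$$

For example, if a = 3 and n = 1, then $\operatorname{Var}(e^{-X^2/2}) = 0.0201$ and $\operatorname{Var}(\hat\theta) = \left(\frac{e^{-4.5}}{3\sqrt{2\pi}}\right)^2 \times 0.0201 \approx 4.38 \times 10^{-8}$. On the other hand, a standard estimator has variance $\theta(1 - \theta) = 0.00134$.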
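Example 8.13 can be sketched in Python as follows. The function name, seed, and sample size are illustrative assumptions; `random.expovariate(a)` generates the exponential draws with rate a, and `statistics.NormalDist` supplies the exact tail probability for comparison.

```python
import math
import random
import statistics
from statistics import NormalDist


def tail_probability_is(a, n, seed=0):
    """Estimate P(Z > a) for standard normal Z using exponential(a) draws."""
    rng = random.Random(seed)
    const = math.exp(-a * a / 2.0) / (a * math.sqrt(2.0 * math.pi))
    values = [const * math.exp(-rng.expovariate(a) ** 2 / 2.0) for _ in range(n)]
    estimate = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    return estimate, std_err


if __name__ == "__main__":
    est, se = tail_probability_is(3.0, 20_000)
    print("estimate:", est, "std. err.:", se)
    print("exact   :", 1.0 - NormalDist().cdf(3.0))
```

Multiplying the squared standard error by n recovers an estimate of the single-draw variance, which can be compared with the value of about $4.38 \times 10^{-8}$ quoted for a = 3.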


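Consider simulating a vanilla European call option price again, using the importance sampling technique. Suppose that we evaluate the value of a deep out-of-the-money ($S_0 \ll K$) European call option with a short maturity T. Many sample paths result in $S_T \le K$ and give zero values. Thus, these samples are wasted. One possible way to deal with this problem is to increase the values of the $Z_i$s by sampling them from a distribution with large mean and large variance. Sample $\tilde Z_i$ from $N\left(\frac{m}{\sigma\sqrt{T}}, s^2\right)$ so that

$$\sigma\sqrt{T}\tilde Z_i \sim N(m, \sigma^2 Ts^2).$$

Note that $\tilde Z_i$ can be written as

$$\tilde Z_i = \frac{m}{\sigma\sqrt{T}} + sZ_i, \quad Z_i \sim N(0, 1).$$

The importance sampling estimator is then given by

$$\hat C_I = \frac{e^{-rT}}{N}\sum_{i=1}^{N}\max\{S_0 e^{(r - \sigma^2/2)T + \sigma\sqrt{T}\tilde Z_i} - K, 0\}\,R(\tilde Z_i),$$

where

$$R(\tilde Z_i) = \frac{\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{\tilde Z_i^2}{2}\right)}{\frac{1}{\sqrt{2\pi}\,s}\exp\left(-\frac{(\tilde Z_i - m/(\sigma\sqrt{T}))^2}{2s^2}\right)} = s\exp\left(\frac{Z_i^2}{2} - \frac{\tilde Z_i^2}{2}\right).$$

Thus, $\hat C_I$ can be expressed as

$$\hat C_I = se^{-rT}\,\frac{1}{N}\sum_{i=1}^{N}\max\{S_0 e^{(r - \sigma^2/2)T + m + s\sigma\sqrt{T}Z_i} - K, 0\}\exp\left(\frac{Z_i^2}{2} - \frac{1}{2}\left(\frac{m}{\sigma\sqrt{T}} + sZ_i\right)^2\right).$$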


Example 8.14 Let $S_0 = 100$, $K = 140$, $r = 0.05$, $\sigma = 0.3$, and $T = 1$. We simulate the value of this deep out-of-the-money European call option, using the importance sampling technique, and compare it with the result of the standard method. See the online material for the VBA codes.

For N = 10,000, we have $\hat C_I = 3.1202$ with standard error 0.0264 using importance sampling, while getting $\hat C = 3.0166$ with standard error 0.1090 using the standard method. The result shows that the importance sampling technique gives a more precise estimate of the price of the option, which has a theoretical Black-Scholes price of 3.1187.
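A Python sketch of the importance sampling estimator for this deep out-of-the-money call is given below. The text does not state the shift m and scale s used to produce the reported figures, so the defaults here are assumptions chosen for illustration: s = 1 and m = log(K/S_0) - (r - sigma^2/2)T, which centres the shifted terminal log-price at the strike. The function names and seed are likewise hypothetical, and the output is not expected to match the reported numbers exactly.

```python
import math
import random
import statistics


def importance_sampling_call(n=10_000, s0=100.0, k=140.0, r=0.05, sigma=0.3,
                             t=1.0, m=None, s=1.0, seed=0):
    """Call price with normals drawn from N(m / (sigma sqrt(T)), s^2)."""
    rng = random.Random(seed)
    nu_t = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    if m is None:
        m = math.log(k / s0) - nu_t       # assumed shift, not taken from the text
    shift = m / vol
    values = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        z_tilde = shift + s * z                                # shifted, scaled normal
        payoff = max(s0 * math.exp(nu_t + vol * z_tilde) - k, 0.0)
        ratio = s * math.exp(0.5 * z * z - 0.5 * z_tilde * z_tilde)   # R(z_tilde)
        values.append(math.exp(-r * t) * payoff * ratio)
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


def standard_call(n=10_000, s0=100.0, k=140.0, r=0.05, sigma=0.3, t=1.0, seed=0):
    """Plain Monte Carlo price of the same option, for comparison."""
    rng = random.Random(seed)
    nu_t = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    values = [math.exp(-r * t) * max(s0 * math.exp(nu_t + vol * rng.gauss(0.0, 1.0)) - k, 0.0)
              for _ in range(n)]
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


if __name__ == "__main__":
    print("importance sampling:", importance_sampling_call())
    print("standard           :", standard_call())
```

Other choices of m and s can be passed in directly; a larger s spreads the draws further into the tail at the cost of more variable likelihood ratios.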


8.6 EXERCISES 

1. Let $U \sim U(0, 1)$ and let a and b be two given constants with a < b. Show that $Y = a + (b - a)U$ is distributed as a U(a, b) random variable.

2. Let $\hat b$ be the least squares estimate of b in the simple linear regression model $X = a + bY + e$, $e \sim (0, \sigma^2)$ i.i.d. Show that

$$\widehat{\operatorname{Var}}(X - \hat bY) = \widehat{\operatorname{Var}}(X) - \hat b^2\,\widehat{\operatorname{Var}}(Y).$$

3. Suppose that you want to estimate $\theta = \int_0^1 e^{x^2}\,dx$. Show that generating a random number U and then using the antithetic estimator $e^{U^2}(1 + e^{1 - 2U})/2$ is better than generating two random numbers $U_1$ and $U_2$ and using the standard estimator $(e^{U_1^2} + e^{U_2^2})/2$.

4. Consider estimating $\theta = \int_0^1 4x\,dx$.

(a) Using standard simulation technique, estimate $\theta$.

(b) Using antithetic variable technique, construct an improved estimate of $\theta$.

(c) Using stratification, construct another estimate of $\theta$.

(d) Construct a control variate estimate of $\theta$.

(e) Compare the performance of these different estimates. 

(f) Can you combine the aforementioned methods to improve the result? 

5. Consider $\theta = \int_2^\infty (x - 2)e^{-x}\,dx$.

(a) It is known that $\theta = E[f(X)]$, where $X \sim \mathrm{Exp}(1)$. What is $f(X)$?

(b) Provide an algorithm to sample $X$ from the interval $[2, \infty)$.

(c) Provide an algorithm to stratify $X$ in the interval $[2, \infty)$ with equal probability 1/4 for each stratified interval.

(d) Provide a Monte Carlo algorithm using $(X - 2)$ as the control variable.

6. Redo Examples 8.6, 8.7, and 8.9 using $S_0 = K = 100$, $r = 0.05$, $\sigma = 0.1$, and $T = 1$. Calculate the theoretical Black-Scholes price as well.

7. Verify Equations 8.4 and 8.5.

8. Consider a truncated payoff vanilla call option with maturity $T$ and strike price $K$. The payoff function is given by

$$h(S_T) = \begin{cases} S_T - K, & \text{if } K < S_T < S_b, \\ 0, & \text{otherwise}. \end{cases}$$

The given constant $S_b$ acts as a barrier, canceling the option whenever $S_T > S_b$. Assume that the stock price follows a geometric Brownian motion with $\nu = r - \sigma^2/2$, where the risk-free rate $r$ and the volatility $\sigma$ are known. Using the idea of antithetic variables, write a variance reduction algorithm to estimate the payoff function.


The solutions and/or additional exercises are available online at http://www.sta.cuhk.edu.hk/Book/SRMS/.


9.1 INTRODUCTION

Contingent claims other than standard call and put options are known as exotic options. The most common exotic options are path dependent options. As indicated by the name, the payoff of a path dependent option depends on the entire path of the underlying asset price, not just on the terminal asset price. According to this definition, American options are path dependent because the option holder has to decide at each time point whether the option is worth exercising. The path dependent feature of an option usually complicates the analytical tractability of valuation, and simulation is then often the most practical alternative.

Owing to the need to value exotic options, this chapter studies simulation tech- 
niques for European and American style path dependent options. Some of the options 
considered in this chapter have no analytical solutions. 


9.2 BARRIER OPTION 

Barrier options have become increasingly popular. A barrier option is similar to a “vanilla” option, except that it becomes alive or is killed when the barrier is crossed. Let $K$ be the strike price, $T$ be the time to maturity, and $V$ be the value of the barrier. A down-and-in barrier option becomes alive only if the stock price (usually counting only closing prices) goes below $V$ before $T$. A down-and-out barrier option is killed if the stock price goes below $V$ before $T$. A down-and-in barrier call option is a cheaper tool to hedge against the upside risk. From the definition, it can be easily seen that holding both a down-and-in and a down-and-out option with the same strike price $K$, maturity $T$, and barrier $V$ is the same as holding a “vanilla” option. Let $C_{di}$ and $C_{do}$ be the option values of the down-and-in call and the down-and-out call, respectively. Then

$$C_{di} + C_{do} = C,$$

where $C$ is the vanilla call price. Let

$$S_{min} = \min_{0 \le t \le T} S(t) \quad \text{and} \quad I\{S_{min} < V\} = \begin{cases} 1, & S_{min} < V, \\ 0, & S_{min} \ge V, \end{cases}$$

be the realized minimum asset price and the indicator of the down-and-in option, respectively. Then, the value of the option can be written as

$$C_{di} = e^{-rT} E\left[I\{S_{min} < V\}(S(T) - K)^+\right],$$

where $E$ denotes the risk-neutral expectation. The other types of barrier options can be evaluated analogously.

To simulate the value of a down-and-in call option, the algorithm goes as follows:

1. Generate the daily stock prices $S(t_1), S(t_2), \ldots, S(t_n = T)$. If $\min_i S(t_i) < V$, then set

$$C = e^{-rT}\max(S(T) - K, 0),$$

else set $C = 0$.

2. Repeat Step 1 $N$ times to obtain $C_1, \ldots, C_N$. The value of the down-and-in call option is given by

$$\hat{C} = \frac{1}{N}\sum_{j=1}^{N} C_j,$$

and the standard error of the estimator is given by

$$\sqrt{\frac{1}{N(N-1)}\sum_{j=1}^{N}(C_j - \hat{C})^2}.$$
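The two steps above can be sketched in Python as follows. This is an illustrative sketch rather than the book's VBA implementation: the function names are hypothetical, and the sketch additionally records the down-and-out payoff on the same paths so that the parity between the two barrier calls and the vanilla call can be checked. The usage lines borrow the parameter values of Example 9.1 but use fewer paths, so the output is not expected to match the figures reported there.

```python
import math
import random


def mean_and_se(values):
    """Sample mean and its standard error, as in Step 2."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    return mean, math.sqrt(ss / (n * (n - 1)))


def barrier_calls_mc(s0, k, barrier, r, sigma, t, steps, n, seed=1):
    """Down-and-in and down-and-out call estimates from the same paths.

    Returns ((in_price, in_se), (out_price, out_se)).
    """
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    c_in, c_out = [], []
    for _ in range(n):
        s = s0
        s_min = float("inf")
        for _ in range(steps):            # Step 1: one discretely monitored path
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            s_min = min(s_min, s)
        payoff = disc * max(s - k, 0.0)
        knocked_in = s_min < barrier
        c_in.append(payoff if knocked_in else 0.0)
        c_out.append(0.0 if knocked_in else payoff)
    return mean_and_se(c_in), mean_and_se(c_out)


# Parameter values of Example 9.1, with only 2,000 paths to keep the run short.
c_di, c_do = barrier_calls_mc(10.0, 12.0, 9.0, 0.23, 0.4, 1.0, 250, 2000)
print("down-and-in:  %.4f (s.e. %.4f)" % c_di)
print("down-and-out: %.4f (s.e. %.4f)" % c_do)
print("sum (vanilla estimate): %.4f" % (c_di[0] + c_do[0]))
```

Because each path contributes its payoff to exactly one of the two lists, the two estimates always add up to the plain Monte Carlo estimate of the vanilla call on those paths.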


Example 9.1 Let $S_0 = 10$, $r = 0.23$, $\sigma = 0.4$, and $dt = 1/250$. Compute the value of a down-and-in call option with strike price $K = 12$, maturity $T = 1$, and barrier $V = 9$. Please see the online material for the VBA codes.

For $N = 10,000$, we get $C = 1.0273$, and the standard error of $C$ is 0.02048. The 95% confidence interval for $C$ is [0.9872, 1.0675].


9.3 LOOKBACK OPTION

The payoffs of lookback options depend on the maximum or the minimum stock price during the life of the option. Denote the maximum (minimum) of the stock price over the time period $[0, T]$ by $S_{max}(T)$ ($S_{min}(T)$). Four popular lookback options are as follows:

1. Floating strike lookback call ($c_{fl}$): payoff $= S_T - S_{min}(T)$;

2. Floating strike lookback put ($p_{fl}$): payoff $= S_{max}(T) - S_T$;

3. Fixed strike lookback call ($c_{fix}$): payoff $= \max(S_{max}(T) - K, 0)$;

4. Fixed strike lookback put ($p_{fix}$): payoff $= \max(K - S_{min}(T), 0)$.

There are lookback put-call parities connecting the floating strike lookback call (put) to the fixed strike lookback put (call). Specifically, four put-call parities of lookback options are as follows:

1. $c_{fl}(t, S, S_{min}(t)) = S - e^{-r(T-t)}S_{min}(t) + p_{fix}(t, S, S_{min}(t); K = S_{min}(t))$;

2. $p_{fl}(t, S, S_{max}(t)) = e^{-r(T-t)}S_{max}(t) - S + c_{fix}(t, S, S_{max}(t); K = S_{max}(t))$;

3. $c_{fix}(t, S, S_{max}(t); K) = S - e^{-r(T-t)}K + p_{fl}(t, S, \max(S_{max}(t), K))$;

4. $p_{fix}(t, S, S_{min}(t); K) = e^{-r(T-t)}K - S + c_{fl}(t, S, \min(S_{min}(t), K))$.


These four put-call parities are model independent, meaning that they are applicable 
to any asset dynamics. For a proof, we refer to the article of Wong and Kwok (2003). 

Pricing lookback options with simulation is very similar to pricing the barrier option. Consider the floating strike lookback call option. The VBA code of Example 9.1 can be modified to obtain the lookback option price. We just compute

$$C = e^{-rT}\left[S(T) - \min_i S(t_i)\right].$$

Other lookback options are valued in the same manner.
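A minimal Python sketch of this modification is given below, assuming Black-Scholes dynamics and discrete monitoring; the function name is hypothetical and the code is not the book's VBA program. The text does not give parameter values for a lookback example, so the usage lines simply reuse those of Example 9.1 as an assumption.

```python
import math
import random


def floating_lookback_call_mc(s0, r, sigma, t, steps, n, seed=1):
    """Monte Carlo price of a floating strike lookback call.

    Payoff per path: S(T) minus the minimum price observed on [0, T].
    Returns (estimate, standard error).
    """
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    values = []
    for _ in range(n):
        s = s0
        s_min = s0                        # the initial price counts toward the minimum
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            if s < s_min:
                s_min = s                 # only the running minimum is kept
        values.append(disc * (s - s_min))
    mean = sum(values) / n
    se = math.sqrt(sum((v - mean) ** 2 for v in values) / (n * (n - 1)))
    return mean, se


# Assumed inputs: S_0 = 10, r = 0.23, sigma = 0.4, T = 1, 250 steps.
price, se = floating_lookback_call_mc(10.0, 0.23, 0.4, 1.0, 250, 2000)
print("floating strike lookback call: %.4f (s.e. %.4f)" % (price, se))
```

Since $S_{min}(T) \le \min(S_0, S_T)$ on every path, the payoff is never smaller than that of an at-the-money vanilla call, which gives a quick sanity check on the output.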

It is interesting to notice that simulating fixed strike lookback options requires less storage than simulating the floating ones. The reason is that payoffs of fixed strike lookback options do not depend on the terminal asset price, $S_T$. Therefore, after generating a sample path, only the maximum or minimum price of the path is required. With this observation and the lookback put-call parities, a storage-saving approach to simulating floating strike lookback options can be developed. For valuing a floating strike lookback call, a fixed strike lookback put with strike price equal to the realized minimum asset value is simulated. Then, the floating strike call price is extracted from the first put-call parity.


9.4 ASIAN OPTION

Asian option payoffs depend on the average of the underlying asset prices during the option life. Asian options are popular in the financial industry because they cost less than their vanilla counterparts and are less sensitive to changes in the underlying asset price. The averaging in option contracts can be either the geometric average or the arithmetic average of the underlying variable. Denote the geometric average and arithmetic average of the underlying asset in the period $[0, T]$ by $G_T$ and $A_T$, respectively. Then,

$$G_T = \exp\left(\frac{1}{T}\int_0^T \log S(t)\,dt\right), \qquad A_T = \frac{1}{T}\int_0^T S(t)\,dt. \tag{9.1}$$



For geometric Asian options, analytical pricing formulas are available in the literature; see for example Wong and Cheung (2004). However, almost all Asian options are traded with the arithmetic average. For instance, two frequently traded Asian options are as follows:

1. Floating strike Asian call. Payoff $= \max(S_T - A_T, 0)$;

2. Fixed strike Asian call. Payoff $= \max(A_T - K, 0)$.

In practice, the geometric Asian option prices are used as a control variate in simulating their arithmetic counterparts.

Let us illustrate the procedure by considering a fixed strike Asian call. The geometric version of the option has the payoff $\max(G_T - K, 0)$. Denote $\log G_T$ by $X_T$, that is,

$$X_T = \frac{1}{T}\int_0^T \log S(\tau)\,d\tau.$$

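By Ito’s lemma,

$$\log S_\tau = \log S_t + \nu(\tau - t) + \sigma(W(\tau) - W(t)), \qquad \tau \ge t \quad (\text{recall: } \nu = r - \sigma^2/2),$$

which implies

$$\begin{aligned}
X_T &= \frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T} + \frac{\sigma}{T}\left[T(W_T - W_t) - \int_t^T \tau\,dW_\tau\right] \\
&= \frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T} + \frac{\sigma}{T}\int_t^T (T - \tau)\,dW_\tau,
\end{aligned}$$

where the second last line uses the integration by parts formula; see Example (4.2). By Ito’s identities, see Exercise 1(d) in Chapter 4, we have

$$E\left[\int_t^T (T - \tau)\,dW(\tau)\right] = 0 \quad \text{and} \quad \mathrm{Var}\left[\int_t^T (T - \tau)\,dW(\tau)\right] = \int_t^T (T - \tau)^2\,d\tau = \frac{(T-t)^3}{3}.$$

Therefore,

$$X_T \sim N\left(\frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T},\ \frac{\sigma^2(T-t)^3}{3T^2}\right).$$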

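Risk-neutral valuation asserts that

$$C_f^G(t, S, G_t) = e^{-r(T-t)} E\left[\max(e^{X_T} - K, 0)\right]. \tag{9.2}$$

Applying Lemma 5.1, we obtain the closed form solution as

$$C_f^G(t, S, G_t) = S^{(T-t)/T}\,G_t^{t/T}\,e^{R(t;T)}\,\Phi(d_1) - Ke^{-r(T-t)}\Phi(d_2), \tag{9.3}$$

where

$$d_1 = \frac{\frac{T-t}{T}\log\frac{S}{K} + \frac{t}{T}\log\frac{G_t}{K} + \left(r - \frac{\sigma^2}{2}\right)\frac{(T-t)^2}{2T} + \sigma^2\frac{(T-t)^3}{3T^2}}{\sigma\sqrt{\frac{(T-t)^3}{3T^2}}}, \qquad d_2 = d_1 - \sigma\sqrt{\frac{(T-t)^3}{3T^2}}, \tag{9.4}$$

$$R(t; T) = \left(r - \frac{\sigma^2}{2}\right)\frac{(T-t)^2}{2T} + \sigma^2\frac{(T-t)^3}{6T^2} - r(T-t).$$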

With the analytical solution of the geometric Asian call (GAC), we simulate the arithmetic Asian price via control variate. The algorithm is presented as follows.

Step 1: Generate daily stock prices $S(t_1), S(t_2), \ldots, S(t_n)$.

Step 2: Set

$$G_j = \left(\prod_{i=1}^{n} S(t_i)\right)^{1/n}, \qquad C_G^j = e^{-r(T-t)}\max(G_j - K, 0),$$

$$A_j = \frac{1}{n}\sum_{i=1}^{n} S(t_i), \qquad C_A^j = e^{-r(T-t)}\max(A_j - K, 0).$$

Step 3: Repeat Steps 1 and 2 $N$ times.

Step 4: Compute the regression coefficients $a$ and $b$ by fitting

$$C_A^j = a + bC_G^j, \qquad j = 1, 2, \ldots, N.$$

Step 5: $\hat{C}_A = a + b\,C_f^G(t, S, G_t)$ with formula (9.3) applied.

Example 9.2 Consider the parameter values: $S_t = 10$, $r = 0.03$, $\sigma = 0.4$, $t = 0.2$, $T = 1$, and the realized arithmetic average $A_t = 10.5$. Simulate the arithmetic Asian call option with a fixed strike price of \$12. Please see the online material for the VBA implementation.

This simulation gives the arithmetic Asian call (AAC) price to be 0.1698. The analytical price for the GAC is computed as 0.1318. The AAC is a bit more expensive than the GAC because the arithmetic mean always dominates the geometric mean. The computational time is about 10 s.
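The control variate procedure (Steps 1 to 5) can be sketched in Python as follows. Several simplifying assumptions are made and should be kept in mind: the option is newly issued ($t = 0$, no realized average), the averages are taken over the $n$ simulated dates only, and the control's exact price is the closed form for this discretely sampled geometric average (derived from the lognormality of the geometric mean) rather than the continuous-time formula (9.3). All function names are hypothetical, and the output is not meant to reproduce Example 9.2.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def geometric_asian_call_discrete(s0, k, r, sigma, t, steps):
    """Exact price of a call on the geometric mean of S(t_1), ..., S(t_n)."""
    dt = t / steps
    nu = r - 0.5 * sigma ** 2
    mu = math.log(s0) + nu * dt * (steps + 1) / 2.0          # mean of log G
    var = sigma ** 2 * dt * (steps + 1) * (2 * steps + 1) / (6.0 * steps)
    sd = math.sqrt(var)
    d2 = (mu - math.log(k)) / sd
    d1 = d2 + sd
    return math.exp(-r * t) * (math.exp(mu + 0.5 * var) * norm_cdf(d1) - k * norm_cdf(d2))


def arithmetic_asian_call_cv(s0, k, r, sigma, t, steps, n, seed=1):
    """Returns (control variate estimate, approx. s.e., plain estimate, its s.e.)."""
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    c_a, c_g = [], []
    for _ in range(n):                                       # Steps 1-3
        s, total, log_total = s0, 0.0, 0.0
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            total += s
            log_total += math.log(s)
        c_a.append(disc * max(total / steps - k, 0.0))
        c_g.append(disc * max(math.exp(log_total / steps) - k, 0.0))
    mean_a, mean_g = sum(c_a) / n, sum(c_g) / n
    sxx = sum((g - mean_g) ** 2 for g in c_g)
    sxy = sum((g - mean_g) * (a - mean_a) for g, a in zip(c_g, c_a))
    b = sxy / sxx                    # Step 4 (needs some in-the-money paths)
    a = mean_a - b * mean_g
    exact = geometric_asian_call_discrete(s0, k, r, sigma, t, steps)
    estimate = a + b * exact         # Step 5
    rss = sum((ca - a - b * cg) ** 2 for ca, cg in zip(c_a, c_g))
    se_cv = math.sqrt(rss / (n - 2) / n)                     # residual-based approximation
    se_plain = math.sqrt(sum((ca - mean_a) ** 2 for ca in c_a) / (n * (n - 1)))
    return estimate, se_cv, mean_a, se_plain


result = arithmetic_asian_call_cv(10.0, 12.0, 0.03, 0.4, 1.0, 50, 2000)
print("control variate: %.4f (s.e. %.4f); plain: %.4f (s.e. %.4f)" % result)
```

Using the exact mean of the same discretely sampled control keeps the regression adjustment unbiased for the simulated quantity; substituting formula (9.3) would instead match the continuously averaged contract described in the text.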


9.5 AMERICAN OPTION 

American options allow the holder to exercise before maturity. This early exercise feature exists in major financial markets. The valuation and optimal exercise of American options is one of the most challenging problems in derivatives finance, especially when more than one factor is involved in the option contract.

Although simulation techniques can be used to generate future scenarios, the forward looking feature of simulation complicates the valuation of American options, where the optimal exercise policy has to be constructed via backward induction. When an American put option is valued with a binomial tree, one has to determine if it is optimal to exercise the option at each node in a backward manner. A practical approach to valuing American options with simulation is proposed by Longstaff and Schwartz (2001). This section presents the idea of American option pricing using this approach.


9.5.1 Simulation: Least Squares Approach 

The best way to illustrate the least squares approach of Longstaff and Schwartz (2001) is by means of a concrete example. In the following numerical example, we introduce the algorithm in detail first and explain the concepts later.

Example 9.3 Let $S(0) = 10$, $r = 0.03$, $\sigma = 0.4$. Compute the value of an American put option with strike price $K = 12$ and maturity $T = 1$. For simplicity, assume that the option can be exercised at $t = 1/3$, $2/3$, and $1$.

TABLE 9.1 Sample Paths

Path    t = 1/3    t = 2/3    t = 1      Y_3 = max(K - S(1), 0)
1        8.3826     9.9528     6.7581    5.2419
2       11.9899    13.8988    14.5060    0
3       13.1381    17.4061    13.4123    0
4        6.8064     7.8115    10.6520    1.3480
5        7.0508     9.1293     7.4551    4.5449
6       11.2214     8.3600     9.2896    2.7104
7        8.9672     8.7787     9.0822    2.9178
8       11.5336    10.9398     8.6958    3.3042


TABLE 9.2 Regression at t = 2/3

Path    Y_3 e^{-r Δt}    S(2/3)     In-the-money?
1       5.1898            9.9528    Yes
2       -                13.8988    No
3       -                17.4061    No
4       1.3346            7.8115    Yes
5       4.4997            9.1293    Yes
6       2.6834            8.3600    Yes
7       2.8888            8.7787    Yes
8       3.2714           10.9398    Yes


We use the formula $S(t + \Delta t) = S(t)\exp[(r - \sigma^2/2)\Delta t + \sigma\Delta W_t]$ to generate asset prices at the exercise time points $t = 1/3$, $2/3$, and $1$. Table 9.1 gives eight sample paths. Terminal payoffs corresponding to each path, $Y_3$, are given by the last column of the table. Discounting the sample mean of the terminal payoffs estimates the European put price to be \$2.4343. This is a lower bound for the American put option.

At time $t = 2/3$, the option holder must decide whether to exercise the option immediately or to continue holding it when the option is in-the-money. To make the decision, the holder should compare the cash flow of immediate exercise with the expected payoff of continuation given the asset price at time 2/3. Therefore, it is essential to estimate the conditional expected payoff. To do this, we collect the response variable $Y_3 e^{-r\Delta t}$ and the explanatory variable $S(2/3)$ for in-the-money paths in Table 9.2, where $\Delta t = 1/3$. We model the expected payoff from continuation at time $t = 2/3$ as a quadratic polynomial, $f_2(S)$, of the asset value at time $t = 2/3$. Coefficients of the polynomial are estimated from the data in Table 9.2 by the least squares method. Therefore, we estimate $a_0$, $a_1$, and $a_2$ from the regression equation:

$$Y_3 e^{-r\Delta t} = a_0 + a_1[S(2/3)] + a_2[S(2/3)]^2 + \epsilon.$$

The resulting formula is

$$E[Y_3 e^{-r\Delta t} \mid S(2/3)] = -82.5347 + 17.7788[S(2/3)] - 0.9063[S(2/3)]^2 := f_2(S).$$


With this conditional expectation function, $f_2(S)$, we are able to compare the value of immediate exercise, $K - S(2/3)$, with the value of continuation and compute the payoffs, $Y_2$, for each path at $t = 2/3$. The value of $Y_2$ is obtained by the formula

$$Y_2 = \begin{cases} K - S(2/3), & \text{if } K - S(2/3) > f_2(S(2/3)), \\ e^{-r\Delta t}Y_3, & \text{otherwise}. \end{cases}$$

This formula asserts that the payoff at time $t = 2/3$ is $K - S$ if exercising the option is worth more than the expected payoff from holding it; otherwise, the payoff at time 2/3 becomes the discounted cash flow from the next exercise time. The last column of Table 9.3 gives the payoffs, $Y_2$, for each sample path.

Next, we repeat the procedure for $t = 1/3$. In Table 9.4, all sample paths are in-the-money except path 3. Then, the least squares estimation corresponding to in-the-money paths gives

$$E[Y_2 e^{-r\Delta t} \mid S(1/3)] = -8.9488 + 3.3104[S(1/3)] - 0.2036[S(1/3)]^2 := f_1(S).$$

This regression function determines the exercising policy at $t = 1/3$.


TABLE 9.3 Optimal Decision at t = 2/3

Path    Exercise        Continuation     e^{-r Δt} Y_3    Y_2
        K - S(2/3)      f_2(S(2/3))
1       2.0472          4.6380           5.1898           5.1898
2       -               -                0                0
3       -               -                0                0
4       4.1885          1.0428           1.3346           4.1885
5       2.8707          4.2388           4.4997           4.4997
6       3.6400          2.7554           2.6834           3.6400
7       3.2213          3.6959           2.8888           2.8888
8       1.0602          3.4968           3.2714           3.2714

TABLE 9.4 Regression at t = 1/3

Path    Y_2 e^{-r Δt}    S(1/3)     In-the-money?
1       5.1381            8.3826    Yes
2       0                11.9899    Yes
3       0                13.1381    No
4       4.1468            6.8064    Yes
5       4.4549            7.0508    Yes
6       3.6038           11.2214    Yes
7       2.8600            8.9672    Yes
8       3.2388           11.5336    Yes

TABLE 9.5 Optimal Decision at t = 1/3

Path    Exercise        Continuation     e^{-r Δt} Y_2    Y_1
        K - S(1/3)      f_1(S(1/3))
1       3.6174          4.4921           5.1381           5.1381
2       0.0101          1.4689           0                0
3       -               -                0                0
4       5.1936          4.1494           4.1468           5.1936
5       4.9492          4.2688           4.4549           4.9492
6       0.7786          2.5572           3.6038           3.6038
7       3.0328          4.3620           2.8600           2.8600
8       0.4664          2.1440           3.2388           3.2388


Once again, the $Y_1$ in Table 9.5 is computed according to the optimal decision rule

$$Y_1 = \begin{cases} K - S(1/3), & \text{if } K - S(1/3) > f_1(S(1/3)), \\ e^{-r\Delta t}Y_2, & \text{otherwise}. \end{cases}$$

Finally, the current price of the American option is estimated by the average of $e^{-r\Delta t}Y_1$, that is, \$3.0919, which is higher than the European option price \$2.4343.
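The backward pass of Example 9.3 can be replayed on the eight paths of Table 9.1 with the short Python sketch below. The helper names (`fit_quadratic`, `lsm_put`, `european_put`) are hypothetical and the quadratic fit is done with hand-written normal equations so that the block needs no external library; it is a sketch of the procedure, not the book's VBA code. Up to rounding of the tabulated prices, it should return values close to the 3.0919 and 2.4343 reported above.

```python
import math

# Table 9.1: asset prices at t = 1/3, 2/3, 1 for the eight sample paths.
PATHS = [
    (8.3826, 9.9528, 6.7581),
    (11.9899, 13.8988, 14.5060),
    (13.1381, 17.4061, 13.4123),
    (6.8064, 7.8115, 10.6520),
    (7.0508, 9.1293, 7.4551),
    (11.2214, 8.3600, 9.2896),
    (8.9672, 8.7787, 9.0822),
    (11.5336, 10.9398, 8.6958),
]


def fit_quadratic(xs, ys):
    """Least squares fit y ~ a0 + a1*x + a2*x^2 via the normal equations."""
    m = [[0.0] * 4 for _ in range(3)]         # augmented 3x4 system
    for x, y in zip(xs, ys):
        powers = (1.0, x, x * x)
        for i in range(3):
            for j in range(3):
                m[i][j] += powers[i] * powers[j]
            m[i][3] += powers[i] * y
    for col in range(3):                      # elimination with partial pivoting
        piv = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for j in range(col, 4):
                m[row][j] -= f * m[col][j]
    coef = [0.0] * 3
    for i in (2, 1, 0):                       # back substitution
        rhs = m[i][3] - sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = rhs / m[i][i]
    return coef


def lsm_put(paths, k, r, dt):
    """Least squares Monte Carlo value of a put exercisable at every date."""
    disc = math.exp(-r * dt)
    y = [max(k - p[-1], 0.0) for p in paths]              # terminal payoffs
    for j in range(len(paths[0]) - 2, -1, -1):
        y = [disc * v for v in y]                         # discount one period
        itm = [i for i, p in enumerate(paths) if p[j] < k]
        a0, a1, a2 = fit_quadratic([paths[i][j] for i in itm],
                                   [y[i] for i in itm])
        for i in itm:
            s = paths[i][j]
            if k - s > a0 + a1 * s + a2 * s * s:          # exercise beats continuation
                y[i] = k - s
    return disc * sum(y) / len(y)


def european_put(paths, k, r, t):
    return math.exp(-r * t) * sum(max(k - p[-1], 0.0) for p in paths) / len(paths)


print("American put estimate: %.4f" % lsm_put(PATHS, 12.0, 0.03, 1.0 / 3.0))
print("European put estimate: %.4f" % european_put(PATHS, 12.0, 0.03, 1.0))
```

Only in-the-money paths enter each regression, and a path's cash flow is overwritten only when immediate exercise exceeds the fitted continuation value, mirroring Tables 9.2 to 9.5.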


9.5.2 Analyzing the Least Squares Approach 

Consider an American put option with exercise rights at $t_1 < \cdots < t_n = T$. To simplify matters, we assume $t_{j+1} - t_j = \Delta t$ for $j = 1, 2, \ldots, n - 1$. Given a sample path of the underlying asset price, $\{S(t_1), S(t_2), \ldots, S(t_n)\}$, we study possible payoffs received by the option holder at each of the exercise time points. Clearly, if the option is not exercised prematurely, then the holder receives the terminal payoff, denoted as $Y_n = \max(K - S(t_n), 0)$. At time $t = t_{n-1}$, the corresponding payoff, $Y_{n-1}$, depends on the holder’s decision of exercising the option. Therefore,

$$Y_{n-1} = \begin{cases} K - S(t_{n-1}), & \text{exercise}, \\ e^{-r\Delta t}Y_n, & \text{continue}. \end{cases}$$

This formula indicates that the option holder receives $K - S(t_{n-1})$ if the optimal decision is to exercise the option. Otherwise, the holder will receive a cash flow of $Y_n$ at the next time step. The present value of this cash flow is obtained by multiplying it by the discount factor $e^{-r\Delta t}$. Inductively, the payoff $Y_j$ at time $t_j$ can be described as

$$Y_j = \begin{cases} K - S(t_j), & \text{exercise}, \\ e^{-r\Delta t}Y_{j+1}, & \text{continue}. \end{cases} \tag{9.5}$$

This iterative process continues until $Y_1$ is obtained. As the option holder has no exercise right in the time period $[0, t_1)$, the American put option can be viewed as a European option that expires at $t_1$ with payoff $Y_1$. Risk-neutral valuation allows us to value the American put, $P_A(0, S)$, as

$$P_A(0, S) = E\left[e^{-rt_1}Y_1 \mid S_0 = S\right].$$

Therefore, a typical simulation algorithm generates $N$ sample paths; each follows the algorithm to obtain $\{Y_1^1, \ldots, Y_1^N\}$. The American put is estimated by

$$\hat{P}_A(0, S) = \frac{1}{N}\sum_{i=1}^{N} e^{-rt_1}Y_1^i. \tag{9.6}$$

The aforementioned simulation is incomplete, however. To simulate the American put, the payoff, $Y_1$, at time $t_1$ should be obtained via simulation. This requires the simulation algorithm to detect optimal exercise at each time point successively. In other words, we have to clarify the condition of exercising the option in Equation 9.5. It is crucial that the optimal decision should not be made by simply comparing the values of $K - S(t_j)$ and $e^{-r\Delta t}Y_{j+1}$ in Equation 9.5. The reason is that the decision at time $t_j$ should be based on the information up to $t_j$. However, the value $Y_{j+1}$ depends on the asset value at $t_{j+1}$. The correct approach is to compare the immediate exercise cash flow $K - S(t_j)$ with the expectation of the discounted cash flow conditional on the asset price $S(t_j)$. This leads (Eq. 9.5) to

$$Y_j = \begin{cases} K - S(t_j), & \text{if } K - S(t_j) > f_j(S(t_j)), \\ e^{-r\Delta t}Y_{j+1}, & \text{if } K - S(t_j) \le f_j(S(t_j)), \end{cases} \tag{9.7}$$

where $f_j(S(t_j))$ is the conditional expectation function at $t_j$, that is,

$$f_j(S(t_j)) = E\left[e^{-r\Delta t}Y_{j+1} \mid S(t_j)\right]. \tag{9.8}$$


The key to the Longstaff and Schwartz (2001) approach is the use of least squares to estimate the function $f_j(S)$. Under certain technical conditions, it can be shown that the function $f_j(S(t_j))$ can be approximated by a polynomial of $S(t_j)$. In other words,

$$f_j(S(t_j)) = \sum_{k=0}^{\infty} a_k[S(t_j)]^k,$$

where $\{a_k\}$ converges to zero rapidly. Therefore, one way to approximate $f_j(S)$ is by truncating the polynomial of infinite order to a finite order polynomial. Coefficients of the finite order polynomial are estimated through the least squares method.

In Example 9.3, we use a polynomial of degree 2 to approximate $f_j(S)$. The simulation starts by generating $N$ asset price paths, $\{S_i(t_1), \ldots, S_i(t_n)\}$ for $i = 1, 2, \ldots, N$. When $t = t_n$, it is clear that $Y_n^i = \max[K - S_i(t_n), 0]$ for path $i$. We go one step back to the time point $t = t_{n-1}$, where $N$ possible asset prices have been generated. Then, the coefficients $a_0$, $a_1$, and $a_2$ are obtained by least squares estimation of the regression:

$$f_{n-1}(S) = E\left[e^{-r\Delta t}Y_n \mid S\right] = a_0 + a_1[S(t_{n-1})] + a_2[S(t_{n-1})]^2. \tag{9.9}$$

The estimation is based on the sample $\{(S_i(t_{n-1}), Y_n^i) : K > S_i(t_{n-1}),\ i = 1, \ldots, N\}$, that is, in-the-money paths. Then, payoffs at $t_{n-1}$ are calculated via the rule in Equation 9.7. Having a sample of payoffs $\{Y_{n-1}^i \mid i = 1, 2, \ldots, N\}$ at $t_{n-1}$, we go one step back to the time point $t_{n-2}$ and repeat the process. Eventually, we obtain $N$ possible payoffs, $\{Y_1^i \mid i = 1, 2, \ldots, N\}$, at $t_1$. Monte Carlo simulation estimates the current option price by the average in Equation 9.6.

Remarks 

1. In the regression Equation 9.9, only in-the-money paths are used in the least 
squares estimation, as these paths are sensitive to immediate exercise. Remem- 
ber that the option holder will exercise the option only when it is in-the-money. 

2. An obvious way to improve the accuracy is to increase the number of terms 
in Equation 9.9. However, one has to strike a balance between increasing the 
number of terms and the quality of estimates. Numerical experiments show that 
polynomials of degree 3 are a reasonable choice. 

3. Instead of using ordinary monomials as basis functions in Equation 9.9, one 
may consider other basis functions, such as Hermite, Laguerre, Legendre, 
Chebyshev, Gegenbauer, and Jacobi polynomials. Numerical tests of Moreno 
and Navas (2003) show that the least squares approach is quite robust to 
the choice of basis functions. For more complex derivatives, this choice can 
slightly affect option prices. 

4. The recent analysis of Stentoft (2004) indicates that a modified specification 
using ordinary monomials is preferred over the specification based on Laguerre 
polynomials used in Longstaff and Schwartz (2001). Furthermore, the least 
squares method is computationally more efficient than other numerical meth- 
ods, such as finite difference, especially when high dimensional problems are 
concerned. 

5. The article by Longstaff and Schwartz (2001) points out that the $R^2$ values of the regressions are often low. This means that the volatility of unexpected cash flows is large relative to the expected cash flows. However, because the least squares simulation is based on conditional first moments rather than higher moments, the $R^2$'s of the regressions should have little impact on the estimated American option price.

6. If the user is really concerned about the $R^2$, it may be more efficient to use other techniques such as weighted least squares and generalized method of moments (GMM) in estimating the conditional expectation function.

Figure 9.1 The exercising region of the American put option.


Example 9.4 Using the parameters in the preceding example, simulate the American put price with continuous exercise rights and hence determine the optimal exercise policy. The simulation is based on 10,000 sample paths with $\Delta t = 1/100$. The online materials provide the VBA codes for the simulation.

By using quadratic conditional expectation functions, our simulation estimates the American put price as 2.739 within 15 s, which is consistent with the binomial model of Hull (2006). For the early exercise policy, we collect the maximum asset value that belongs to the exercising region at each time. For $t > 0.2$, Figure 9.1 plots the exercise policy against time. It is optimal to exercise the option if the stock price falls into the shaded region. It is seen that the early exercise boundary looks like an increasing function of calendar time and hence a decreasing function of the time to maturity. For $t < 0.2$, our simulation has no path in the exercising region, so we are unable to graph the exercising boundary.
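A compact Python sketch of such a simulation is shown below, under the assumption of Black-Scholes dynamics and a quadratic regression in the scaled price $S/K$ (the scaling is a numerical convenience introduced here, not something stated in the text). Function names are hypothetical, the helper `fit_quadratic` solves the normal equations directly, and the usage lines take fewer paths and time steps than Example 9.4, so the printed price is only expected to be in the neighborhood of the reported 2.739.

```python
import math
import random


def fit_quadratic(xs, ys):
    """Least squares fit y ~ a0 + a1*x + a2*x^2 via the normal equations."""
    m = [[0.0] * 4 for _ in range(3)]
    for x, y in zip(xs, ys):
        powers = (1.0, x, x * x)
        for i in range(3):
            for j in range(3):
                m[i][j] += powers[i] * powers[j]
            m[i][3] += powers[i] * y
    for col in range(3):                      # elimination with partial pivoting
        piv = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for j in range(col, 4):
                m[row][j] -= f * m[col][j]
    coef = [0.0] * 3
    for i in (2, 1, 0):
        rhs = m[i][3] - sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = rhs / m[i][i]
    return coef


def american_put_lsm(s0, k, r, sigma, t, steps, n, seed=1):
    """Returns (price estimate, largest exercised asset price per date)."""
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * dt)
    paths = []
    for _ in range(n):
        s, path = s0, []
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            path.append(s)
        paths.append(path)
    y = [max(k - p[-1], 0.0) for p in paths]
    boundary = [None] * steps
    for j in range(steps - 2, -1, -1):        # backward over exercise dates
        y = [disc * v for v in y]
        itm = [i for i in range(n) if paths[i][j] < k]
        if len(itm) < 3:                      # too few points for a quadratic fit
            continue
        a0, a1, a2 = fit_quadratic([paths[i][j] / k for i in itm],
                                   [y[i] for i in itm])
        for i in itm:
            x = paths[i][j] / k
            if k - paths[i][j] > a0 + a1 * x + a2 * x * x:
                y[i] = k - paths[i][j]
                boundary[j] = max(boundary[j] or 0.0, paths[i][j])
    return disc * sum(y) / n, boundary


price, boundary = american_put_lsm(10.0, 12.0, 0.03, 0.4, 1.0, 50, 2000)
print("American put estimate: %.4f" % price)
print("largest exercised price at the last early date:", boundary[-2])
```

The `boundary` list plays the role of the collected maximum asset values described above; entries stay `None` at dates where no simulated path is exercised.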

9.5.3 American Style Path Dependent Options 

The examples considered so far are relevant to pricing American put options; however, the least squares approach is applicable to any early exercisable contingent claim. Denote the terminal payoff function of a path dependent option by $F(S_T, \xi_T)$, where $\xi$ is an exogenous variable. For instance, $\xi_T = S_{min}(T)$ for a barrier option or a lookback option and $\xi_T = A_T$ for an Asian option. The American style path dependent option with payoff $F(S_T, \xi_T)$ can be simulated as follows.

Step 1: Generate asset price paths $\{S_i(t_1), S_i(t_2), \ldots, S_i(t_n)\}$ for $i = 1, 2, \ldots, N$. Set $j = n - 1$ and $Y_n^i = F(S_i(t_n), \xi_i(t_n))$.

Step 2: Use least squares to estimate the coefficients of a polynomial of degree $m$, $P_m(S_i(t_j), \xi_i(t_j))$, from

$$e^{-r\Delta t}Y_{j+1}^i = P_m(S_i(t_j), \xi_i(t_j)),$$

for in-the-money paths.

Step 3: If $F(S_i(t_j), \xi_i(t_j)) > P_m(S_i(t_j), \xi_i(t_j))$, then set $Y_j^i = F(S_i(t_j), \xi_i(t_j))$; otherwise, set $Y_j^i = e^{-r\Delta t}Y_{j+1}^i$.

Step 4: If $j > 1$, then set $j = j - 1$ and go to Step 2.

Step 5: The American option price $= \frac{1}{N}\sum_{i=1}^{N} e^{-r\Delta t}Y_1^i$.

Example 9.5 Suppose that $S_0 = 10$, $r = 0.03$, $\sigma = 0.4$, and $T = 7/12$ (7 months). Simulate the American style floating strike arithmetic Asian put option and plot the optimal exercise regions for $t = 0.2$, 0.4, 0.6, and 0.8. The simulation is based on 10,000 sample paths with $\Delta t = 1/100$.

We approximate the conditional expectation function, $f(S, A)$, by a two-variable quadratic polynomial, that is,

$$f_j(S, A) = a_{00} + a_{10}S + a_{20}S^2 + a_{11}SA + a_{01}A + a_{02}A^2.$$

Our simulation estimates the option price to be 9.783. This number is consistent with the one obtained by the finite difference method (FDM) in Hansen and Jorgensen (2000). The CPU (central processing unit) time is about 17 s for the computation. Figure 9.2 plots the exercise boundaries at time 0.2, 0.4, 0.6, and 0.8. The boundaries are the interfaces between shaded and nonshaded regions. The shaded regions are the continuation regions. For $t = 0.2$, there are fewer points falling into the exercising region. Thus, the simulation is only able to graph the exercise boundary for underlying asset prices in the range of 7-11 at $t = 0.2$.


9.6 GREEK LETTERS 

As pointed out in Chapter 7, hedging is sometimes more important than pricing in risk management. Option hedging requires risk managers to compute option Greeks, such as delta, gamma, vega, and theta. We refer interested readers to Hull (2006) for the application of Greeks in hedging and Joshi (2003) for discrete tree approximation. The Greek letters represent partial derivatives of the option pricing formula with respect to different parameters. Because most options, especially path dependent options, do not have closed form pricing formulas, Greeks should be obtained by means of simulation. For single asset path-independent options, the simulation can be constructed via Theorem 7.3. However, it is inapplicable for path dependent options. Thus, we introduce an alternative practical approach to simulating Greeks.

Figure 9.2 Exercise regions of the American style Asian option.


Let $V$ denote the pricing formula of an option. The option Greeks are defined as follows.

$$\text{Delta} = \frac{\partial V}{\partial S}, \quad \text{Gamma} = \frac{\partial^2 V}{\partial S^2}, \quad \text{Vega} = \frac{\partial V}{\partial \sigma}, \quad \text{Theta} = \frac{\partial V}{\partial t}, \quad \text{Rho} = \frac{\partial V}{\partial r},$$

where $S$ is the underlying asset price, $\sigma$ is the volatility, $t$ is the time variable, and $r$ is the spot interest rate. Hence, the Greeks can be obtained by standard differentiation techniques or approximated by the numerical FDM if the option pricing formula is available. The FDM computes numerical derivatives by approximating the limit in the first-principles definition of the derivative. For instance, suppose that we are interested in the Delta of an option. Then, the FDM approximates the value by

$$\text{Delta} \approx \frac{V(S + h) - V(S)}{h}, \tag{9.10}$$


where $h$ is an arbitrarily chosen small number and other parameters are fixed.

The approach introduced here combines simulation with the FDM. Suppose that we need the Delta of an option. Then we proceed as follows. First, the option price is simulated as usual with the current realized asset price $S$. Second, we re-simulate the option price with a “perturbed” asset price $S + h$. Finally, the Delta is approximated by Equation 9.10. However, the stability of this approach is of great concern because there are two sources of error: simulation error and FDM error. The most critical one is the simulation error, which keeps the numerator of Equation 9.10 away from zero even when $h$ tends to zero, so that the ratio can become arbitrarily large. To circumvent this difficulty, it is very common for market practitioners to use the same set of random numbers in the first and the second steps. We illustrate these ideas with the down-and-out call option in the following example.

Example 9.6 Suppose that $S(0) = 100$, $r = 0.05$, $\sigma = 0.4$, and $T = 1$ (1 year). Estimate the delta of a down-and-out call option with a strike price of 95 and a downside barrier of 80.

We base our simulation on 100,000 sample paths, each of which is divided into 100 equally spaced intervals. Therefore, this simulation requires 10 million independent normal random variables, namely $\epsilon_{ij}$ with $i = 1, 2, \ldots, 100$ and $j = 1, 2, \ldots, 100,000$. Using the set of $\{\epsilon_{ij}\}$, we produce the sample paths $\{S_j(t_1), \ldots, S_j(t_{100})\}$ using the Black-Scholes dynamics of the asset price with $S_j(0) = 100$ for all $j$. Therefore, we get the $C_{do}$ price as in Section 9.2. To obtain delta, we repeat the aforementioned procedure by assuming $S_j(0) = 100 + h$, where $h = 0.01$, to estimate the option price again. It is important to recall that we must use the same set of $\epsilon_{ij}$. After that, the delta is approximated by the FDM. Our simulation estimates the delta of the down-and-out call option to be 0.863. Figure 9.3 shows the distribution of the delta estimates over 100 simulations. Please see the online material for the VBA codes.
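The procedure of Example 9.6 can be sketched in Python as follows. The essential point is that the normal draws are generated once and passed to both price evaluations; the function names are hypothetical, the code is not the book's VBA program, and the usage lines take only 2,000 paths instead of 100,000, so the printed delta will scatter around the reported value rather than reproduce it.

```python
import math
import random


def draw_normals(n_paths, steps, seed=1):
    """One row of standard normal shocks per path, drawn once and reused."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(steps)] for _ in range(n_paths)]


def down_and_out_call(s0, k, barrier, r, sigma, t, normals):
    """Price estimate from pre-drawn shocks (barrier monitored at each step)."""
    steps = len(normals[0])
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    total = 0.0
    for row in normals:
        s = s0
        alive = True
        for z in row:
            s *= math.exp(drift + vol * z)
            if s < barrier:               # knocked out: payoff is zero
                alive = False
                break
        if alive:
            total += max(s - k, 0.0)
    return math.exp(-r * t) * total / len(normals)


def delta_fd(s0, h, k, barrier, r, sigma, t, normals):
    """Forward difference (Equation 9.10) with common random numbers."""
    up = down_and_out_call(s0 + h, k, barrier, r, sigma, t, normals)
    base = down_and_out_call(s0, k, barrier, r, sigma, t, normals)
    return (up - base) / h


eps = draw_normals(2000, 100)
print("price: %.4f" % down_and_out_call(100.0, 95.0, 80.0, 0.05, 0.4, 1.0, eps))
print("delta: %.4f" % delta_fd(100.0, 0.01, 95.0, 80.0, 0.05, 0.4, 1.0, eps))
```

Calling `draw_normals` separately for the two price evaluations would leave a simulation error of fixed size in the numerator, which dividing by $h = 0.01$ would then magnify a hundredfold.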



Figure 9.3 The strike against the delta of a down-and-out call option. (Axis label: estimated delta.)

Other Greek letters can be obtained in a similar manner. For instance, the Gamma is the second order partial differentiation of the option pricing formula with respect to the underlying asset price. To estimate its value, we can approximate the second order differentiation by central finite differencing such that


$$\text{Gamma} \approx \frac{V(S + h) - 2V(S) + V(S - h)}{h^2}.$$

Therefore, we are required to compute $V(S - h)$ in addition to $V(S)$ and $V(S + h)$.
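The central difference can be checked on a case where the answer is known. The Python sketch below applies it to the Black-Scholes formula for a vanilla call (used here only as a stand-in for $V$, since the down-and-out call has no formula given in this section) and compares the result with the analytic gamma $\phi(d_1)/(S\sigma\sqrt{T})$. The function names are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, t):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def bs_gamma(s, k, r, sigma, t):
    """Analytic gamma of the same call, used as the benchmark."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return pdf / (s * sigma * math.sqrt(t))


def gamma_central(price, s, h):
    """Central second difference of a pricing function price(s)."""
    return (price(s + h) - 2.0 * price(s) + price(s - h)) / (h * h)


# Inputs borrowed from Example 9.6, without the barrier.
pricer = lambda s: bs_call(s, 95.0, 0.05, 0.4, 1.0)
print("finite difference gamma: %.6f" % gamma_central(pricer, 100.0, 0.01))
print("analytic gamma:          %.6f" % bs_gamma(100.0, 95.0, 0.05, 0.4, 1.0))
```

When `price` is itself a simulation, all three evaluations should reuse the same random numbers, because the division by $h^2$ amplifies any independent simulation error even more strongly than in the Delta case.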


Example 9.7 Using the input parameters in Example 9.6, plot the gamma of 
down-and-out call option against strike price, where the strike price varies from 93 
to 110. Please see the online material for the VBA codes. 


9.7 EXERCISES 

1. Verify Equations 9.1 and 9.3. 

2. By modifying Example 9.1, simulate the price of down-and-out call, which will 
be knocked out if the underlying asset price goes below $8. 

3. By modifying Example 9.1, simulate prices of a fixed lookback put option and 
a floating lookback call if the fixed strike price and the realized minimum asset 
prices are both $8. Verify the lookback put-call parities of these options. 

4. Show that the price of an American call option equals that of its European counterpart
if the underlying asset pays no dividends. In other words, it is never optimal to exercise
an American call option before maturity if the underlying asset pays no dividends.

5. By modifying Example 9.4, simulate the price of an American call option with a
strike of $12 and a dividend yield \(\delta\) of 4%. Hint: The risk-neutral dynamics of an
asset paying a continuous dividend yield is given by

\[ \frac{dS}{S} = (r - \delta)\,dt + \sigma\,dW. \]

What is the optimal exercise policy from your simulation? Plot the critical asset
prices against time.

6. A forward start option is a path-dependent option whose strike price is set equal to
the underlying asset price at a future date. For instance, the forward start call option
payoff is

\[ \max(S_T - S_{t_1}, 0), \]

where \(0 < t_1 < T\).

(a) Suppose \(S_0\) = $10, \(\sigma = 0.4\), \(r = 0.1\), \(\delta = 0.5\), \(T = 0.5\) and \(t_1 = 0.3\). Construct
and implement an algorithm for the forward start call option with 1,000 sample
paths.

(b) Denote by \(C_{BS}(S, t; K, T)\) the Black-Scholes formula for the standard call
option. On the basis of financial insights, a risk analyst speculates that the
forward start call option is the discounted standard call price. That is,

\[ \text{Current forward start call price} = e^{-r t_1}\, C_{BS}(S_0, t_1; S_0, T). \]

Verify this conjecture by your simulation.

(c) Suppose that the option has a continuous early exercise right after \(t = t_1\).
Determine the option price by the least squares simulation with 10,000 sample
paths.

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.
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10.1 INTRODUCTION 

Multiasset options are exotic options whose payoffs depend on values of multiple 
assets. Multiasset options abound in the financial market. An obvious example is 
index options, where the underlying variable, the financial index, can be thought of 
as a portfolio of multiple assets. Challenges of valuing multiasset options are the 
curse of dimensionality and the lack of analytical tractability. These problems can be 
circumvented by simulations. 

Some examples of multiasset options traded in the financial market are first introduced.
Let \(S_1, S_2, \ldots, S_n\) denote the prices of n different assets.

1. Exchange Options. The right to exchange one asset for another. Thus, the option
payoff is \(\max(S_1 - cS_2, 0)\), where c is a constant multiplicative factor. This
option is useful, for example, when a U.S. investor wants to buy Japanese yen
with eurodollars.

2. Quanto Options. Options on stocks in a foreign country, that is, involving the
exchange rate. If we treat \(S_1\) as the exchange rate and \(S_2\) as the underlying asset
in the foreign country, then there are a number of possible quanto option payoffs,
such as \(S_1 \max(S_2 - K, 0)\), \(\max(S_1 S_2 - K, 0)\), \(\max(S_1, C)\max(S_2 - K, 0)\),
and \(C \max(S_2 - K, 0)\), where C is a fixed constant. The last payoff function
appears to be that of a single-asset option. However, the volatility of the exchange
rate, \(S_1\), does contribute to the option price if \(S_1\) and \(S_2\) are correlated.


Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong. 
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3. Basket Options. Options S on a portfolio. The payoff of a call on a portfolio is 
\(\max(\Pi - K, 0)\), where \(\Pi = \sum_{i=1}^{n} a_i S_i\).

4. Extreme Options. Options on the extrema of different assets. The maximum
call option has the payoff \(\max[\max(S_1, S_2, \ldots, S_n) - K, 0]\).

All multiasset options can be traded with European or American style. Complex mul- 
tiasset options, or structured products, may even involve path-dependent features. In 
such cases, simulations are indispensable. 


10.2 SIMULATING EUROPEAN MULTIASSET OPTIONS 

Consider an option on two assets with payoff \(F(S_1(T), S_2(T))\). In the risk-neutral
world, assets are assumed to follow the dynamics of

\[ \frac{dS_i}{S_i} = r\,dt + \sigma_i\,dW_i, \quad i = 1, 2, \qquad (10.1) \]

where

\[ E(dW_1\,dW_2) = \rho\,dt, \qquad (10.2) \]

and E denotes the risk-neutral expectation. Then, the option can be simulated via the
Cholesky decomposition (Theorem 6.4).

Example 10.1 Suppose that \(S_1(0) = S_2(0) = 10\), \(\sigma_1 = 0.3\), \(\sigma_2 = 0.4\), \(\rho = 0.2\), and
\(r = 0.05\). Simulate the price of an exchange option with maturity of 6 months.

By Ito's lemma, we derive the terminal asset prices as

\[ S_1(T) = S_1(0)\,e^{(r - \sigma_1^2/2)T + \sigma_1 X_1 \sqrt{T}} \quad \text{and} \quad S_2(T) = S_2(0)\,e^{(r - \sigma_2^2/2)T + \sigma_2 X_2 \sqrt{T}}, \qquad (10.3) \]

where \(X_1\) and \(X_2\) are standard normal random variables with correlation \(\rho\).

The option price, \(C_X\), can be determined by evaluating the expectation:

\[ C_X = e^{-rT} E\left[\max(S_1(T) - S_2(T), 0)\right]. \]

We estimate the option price by the following simulation algorithm.

Step 1: For i = 1 to N, perform Steps 2-4 as follows:

Step 2: Generate \(Z_1, Z_2 \sim N(0,1)\) i.i.d. (independent and identically distributed).

Step 3: Set \(X_1 = Z_1\) and \(X_2 = \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\).

Step 4: Compute \(S_1^{(i)}(T)\) and \(S_2^{(i)}(T)\) by Equation 10.3.

Step 5: Set \(C_X = \frac{e^{-rT}}{N} \sum_{i=1}^{N} \max\left(S_1^{(i)}(T) - S_2^{(i)}(T), 0\right)\).

Figure 10.1 The distribution of simulated price. (Horizontal axis: price.)

Figure 10.1 plots the distribution of the estimated price over 100 simulations. We
obtain the estimated option price to be 0.962.
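
The five steps above translate almost line by line into code. The following Python sketch is an illustration only and is not the book's VBA program: the function names, the number of paths, and the seed are assumptions. For comparison it also evaluates the closed-form exchange option price of Margrabe (1978) as stated in Exercise 3 of Section 10.5.

```python
import math
import random


def simulate_exchange_option(s1=10.0, s2=10.0, sigma1=0.3, sigma2=0.4,
                             rho=0.2, r=0.05, maturity=0.5,
                             n_paths=100000, seed=2024):
    """Monte Carlo price of the payoff max(S1(T) - S2(T), 0)."""
    rng = random.Random(seed)
    sqrt_T = math.sqrt(maturity)
    drift1 = (r - 0.5 * sigma1 ** 2) * maturity
    drift2 = (r - 0.5 * sigma2 ** 2) * maturity
    total = 0.0
    for _ in range(n_paths):                                   # Step 1
        z1 = rng.gauss(0.0, 1.0)                               # Step 2
        z2 = rng.gauss(0.0, 1.0)
        x1 = z1                                                # Step 3
        x2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        s1_T = s1 * math.exp(drift1 + sigma1 * x1 * sqrt_T)    # Step 4
        s2_T = s2 * math.exp(drift2 + sigma2 * x2 * sqrt_T)
        total += max(s1_T - s2_T, 0.0)
    return math.exp(-r * maturity) * total / n_paths           # Step 5


def margrabe_price(s1, s2, sigma1, sigma2, rho, maturity):
    """Closed-form exchange option price (Exercise 3 of Section 10.5)."""
    sigma = math.sqrt(sigma1 ** 2 - 2.0 * rho * sigma1 * sigma2 + sigma2 ** 2)
    d1 = (math.log(s1 / s2) + 0.5 * sigma ** 2 * maturity) / (sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s1 * cdf(d1) - s2 * cdf(d2)


if __name__ == "__main__":
    print("simulated :", simulate_exchange_option())
    print("Margrabe  :", margrabe_price(10.0, 10.0, 0.3, 0.4, 0.2, 0.5))
```

The printed values depend on the inputs, the seed, and the number of paths, and the sketch is not claimed to reproduce the figure reported in the text; its purpose is to show the structure of the algorithm and a consistency check against the analytical formula.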


10.3 CASE STUDY: ON ESTIMATING BASKET OPTIONS 

In practice, basket options are often valued by assuming that the value of the port- 
folio of assets comprising the basket follows the Black-Scholes dynamics jointly 
rather than that each asset follows the Black-Scholes dynamics individually. After 
estimating the portfolio volatility from the portfolio return, the basket call option is 
valued by substituting the portfolio volatility into the Black-Scholes formula. This 
approach offers a quick solution to traders. However, the risk manager needs to under- 
stand the risk of this simplifying assumption. We examine this approach by means of 
simulation. 

Consider a basket call option with three underlying assets, \(S_1\), \(S_2\), and \(S_3\). The
payoff of this option is \(\max(S_1 + S_2 + S_3 - K, 0)\). In other words, the holder of the
option has the right to purchase the portfolio as a sum of the three assets for a fixed
value of K. Suppose that the current time is t = 1, and we observe the prices of three
assets since t = 0. Figure 10.2 depicts the paths of the three simulated asset prices.

At t = 1, the asset prices are \(S_1 = 142.69\), \(S_2 = 89.23\), and \(S_3 = 49.73\). The current
portfolio value is the sum of three assets and equals 281.65. On the basis of the three
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asset price paths, the portfolio volatility is estimated to be 0.280. Consider the basket 
option with a strike price of 250, maturity of half a year, and interest rate of 5%. The 
naive application of the Black-Scholes formula produces a value for the option as 
44.81. 
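
The naive valuation amounts to a single call to the Black-Scholes formula with the portfolio value as the spot price. The sketch below is an illustration with hypothetical function names; it assumes no dividends and uses the rounded inputs quoted in the text, so its output should be close to, but need not exactly equal, the value reported above.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, r, sigma, maturity):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    d1 = (math.log(spot / strike) + (r + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-r * maturity) * norm_cdf(d2)


if __name__ == "__main__":
    portfolio_value = 142.69 + 89.23 + 49.73      # sum of the three asset prices
    price = black_scholes_call(portfolio_value, strike=250.0, r=0.05,
                               sigma=0.280, maturity=0.5)
    print(f"naive basket call value: {price:.2f}")
```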

On the other hand, we can use MC (Monte Carlo) simulation to estimate the option
price by assuming that each individual asset follows the Black-Scholes dynamics. By
examining the asset price paths, we estimate the variance-covariance matrix of asset
returns as

\[ \begin{pmatrix} 0.172 & 0.050 & 0.043 \\ 0.050 & 0.088 & 0.038 \\ 0.043 & 0.038 & 0.123 \end{pmatrix}. \]

Then, we simulate asset prices at t= 1.5 using the Cholesky decomposition for 
10,000 times. Figure 10.3 illustrates the idea of generating asset values at t = 1.5, 
the maturity of the option. Terminal values of individual assets are simulated on the 
basis of an approach similar to Equation 10.3 with three assets. The option price 
is then evaluated by discounting the sample mean of the option payoff using the 
interest rate of 5%. The simulated option price is 51.35, which is larger than the 
naive approach of 44.81. 

We are also interested in the contribution of the error in estimating parameters of
the option. We perform a control experiment assuming that the variance-covariance
matrix can be estimated without error. Input the variance-covariance matrix as

\[ \begin{pmatrix} 0.1600 & 0.0360 & 0.0420 \\ 0.0360 & 0.0900 & 0.0315 \\ 0.0420 & 0.0315 & 0.1225 \end{pmatrix}. \]

Figure 10.3 Simulating terminal asset prices. (Horizontal axis: time.)


Using the same set of independent normal random numbers, we obtain the option 
price as 50.71. It appears that the error of estimating the variance-covariance matrix 
does not contribute too much to basket option values. Therefore, the MC and the 
naive approach to valuing basket options can produce significantly different results, 
irrespective of the estimation error. 

In practice, banks and financial institutions usually have a lot of derivatives positions
in their portfolio. Risk managers are responsible for checking the consistency
of the models that are used to value individual derivatives in the portfolio. Imagine a
situation in which a bank buys and sells options on individual assets and a basket of
assets every day. When individual assets are assumed to follow the Black-Scholes
dynamics, it is crucial for the risk manager to realize what kind of assumptions have
been imposed. The simulation shows that it is not appropriate to assume that the
portfolio constituting the basket follows Black-Scholes dynamics jointly because this
assumption is not consistent with the assumption on individual assets. In such a case,
the value of basket options can be significantly underestimated.


10.4 DIMENSION REDUCTION 

For an n-asset option, simulation can be constructed by using the Cholesky decomposition
(Eq. 6.6). However, this requires generating n independent normal random
variables for each scenario. To reduce the computational burden, we can use
principal component analysis (PCA) to approximate the n factors by a smaller number of
factors, usually fewer than 10 in practice.

Suppose that we have an n-dimensional random vector \(X \sim N(0, \Sigma)\), where \(\Sigma\) is an
\(n \times n\) variance-covariance matrix. PCA for normal random variables approximates
X by Y, which follows a distribution similar to that of X but is easier to simulate.

PCA uses the eigenvalue decomposition in Chapter 6 to approximate the random
vector X. Let \(\lambda_1, \lambda_2, \ldots, \lambda_n\) be the eigenvalues of \(\Sigma\) and \(v_1, v_2, \ldots, v_n\) be the
corresponding eigenvectors. As variance-covariance matrices are positive definite, their
eigenvalues are all positive real numbers, so the corresponding square roots are
positive real numbers. Theorem 6.6 asserts that the random vector X can be written as

\[ X = \sqrt{\lambda_1}\, v_1 Z_1 + \sqrt{\lambda_2}\, v_2 Z_2 + \cdots + \sqrt{\lambda_n}\, v_n Z_n, \qquad (10.4) \]

where \(Z_1, Z_2, \ldots, Z_n\) are i.i.d. standard normal random variables. The equality (Eq.
10.4) is defined in the sense of distribution. In PCA, we arrange eigenvalues in
descending order such that \(\lambda_1 > \lambda_2 > \cdots > \lambda_n\). From Equation 10.4, we see that the
contribution of the term \(\sqrt{\lambda_i}\, v_i Z_i\) to the value of X decreases with the index i. The
eigenvector \(v_i\) is called the ith principal component (PC). To approximate X, we
truncate the sum in Equation 10.4 such that

\[ X \approx \sqrt{\lambda_1}\, v_1 Z_1 + \sqrt{\lambda_2}\, v_2 Z_2 + \cdots + \sqrt{\lambda_m}\, v_m Z_m, \]

where m < n. If we are comfortable with this approximation, we then simulate m
independent standard normal random variables \(Z_i\) and calculate everything on the
basis of this approximation.

An important topic in PCA is to determine the value m. The number of terms
used in the approximation depends on the accuracy of the outcome required by the
modeler. If the user requires 100% accuracy apart from simulation error, then the full
formula of Equation 10.4 should be used. PCA is useful when the user requires an
accuracy that is less than 100%. Suppose an accuracy of at least 99% is required. Then,
m is the minimum integer such that

\[ \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \geq 99\%. \]

A proof of this result can be found in standard texts in multivariate analysis, for
example Anderson (2003).

Let us apply PCA in multiasset option pricing. Consider an option with 10 underlying
assets. Each asset follows the Black-Scholes dynamics such that

\[ dS_i = \mu_i S_i\,dt + \sigma_i S_i\,dW_i, \quad i = 1, 2, \ldots, 10, \]

where \(S_i\) is the value of the ith asset. The \(W_1, W_2, \ldots, W_{10}\) are correlated Brownian
motions with correlation matrix:

\[ \begin{pmatrix}
 1.00 &  0.74 &  0.34 & -0.08 &  0.05 & -0.74 &  0.04 & -0.12 &  0.81 &  0.82 \\
 0.74 &  1.00 &  0.81 & -0.04 & -0.57 & -0.25 &  0.06 &  0.47 &  0.89 &  0.92 \\
 0.34 &  0.81 &  1.00 & -0.17 & -0.83 &  0.20 & -0.09 &  0.78 &  0.65 &  0.72 \\
-0.08 & -0.04 & -0.17 &  1.00 &  0.01 & -0.05 &  0.94 & -0.04 & -0.09 & -0.05 \\
 0.05 & -0.57 & -0.83 &  0.01 &  1.00 & -0.55 &  0.00 & -0.94 & -0.41 & -0.45 \\
-0.74 & -0.25 &  0.20 & -0.05 & -0.55 &  1.00 & -0.16 &  0.65 & -0.40 & -0.40 \\
 0.04 &  0.06 & -0.09 &  0.94 &  0.00 & -0.16 &  1.00 & -0.06 &  0.04 &  0.06 \\
-0.12 &  0.47 &  0.78 & -0.04 & -0.94 &  0.65 & -0.06 &  1.00 &  0.31 &  0.34 \\
 0.81 &  0.89 &  0.65 & -0.09 & -0.41 & -0.40 &  0.04 &  0.31 &  1.00 &  0.91 \\
 0.82 &  0.92 &  0.72 & -0.05 & -0.45 & -0.40 &  0.06 &  0.34 &  0.91 &  1.00
\end{pmatrix} \]

A discrete approximation to the asset price dynamics is

\[ \Delta S_i = r S_i\,\Delta t + \sigma_i S_i \sqrt{\Delta t}\,\epsilon_i, \]

where \(\epsilon_i\) are risk factors such that \([\epsilon_i] = X \sim N(0, \Sigma)\), and \(\Sigma\) is the correlation matrix
given previously.

For the given correlation matrix, eigenvalues are obtained as 4.719, 2.843, 1.931,
0.147, 0.104, 0.079, 0.062, 0.056, 0.038, 0.022. Summing up all the eigenvalues
gives a value of 10. When we divide the sum of the first three eigenvalues by the
total sum, the ratio is close to 95%. Therefore, if we accept an error of 5%, the first
three PCs provide sufficient accuracy. Eigenvectors corresponding to the first three
PCs are found to be:

\[ v_1 = (0.31,\ 0.44,\ 0.41,\ -0.05,\ -0.31,\ -0.07,\ -0.00,\ 0.27,\ 0.42,\ 0.43)^T, \]
\[ v_2 = (0.41,\ 0.08,\ -0.22,\ 0.09,\ 0.41,\ -0.57,\ 0.14,\ -0.45,\ 0.18,\ 0.16)^T, \]
\[ v_3 = (0.09,\ -0.03,\ 0.01,\ -0.70,\ 0.12,\ -0.05,\ -0.69,\ -0.10,\ 0.02,\ -0.00)^T. \]

On the basis of the first three PCs, we generate three independent standard normal
random variables, namely \(Z_1\), \(Z_2\), and \(Z_3\), and approximate the n risk factors by

\[ [\epsilon_i] \approx Z_1 \sqrt{\lambda_1}\, v_1 + Z_2 \sqrt{\lambda_2}\, v_2 + Z_3 \sqrt{\lambda_3}\, v_3. \]

The 10 risk factors \(\epsilon_1, \epsilon_2, \ldots, \epsilon_{10}\) are reduced to only three independent factors.
Hence, we reduce a 10-dimensional problem to a three-dimensional problem.
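
The bookkeeping behind this reduction can be sketched as follows. The eigenvalues are the ten values quoted above; the thirty eigenvector entries printed in the text are read here as three vectors of ten components each, which is an assumption about the layout (each group has squared entries summing to roughly one). The function names are hypothetical and the sketch is not the book's VBA code.

```python
import math
import random

# Eigenvalues quoted in the text, in descending order.
EIGENVALUES = [4.719, 2.843, 1.931, 0.147, 0.104, 0.079, 0.062, 0.056, 0.038, 0.022]

# First three eigenvectors, read from the text as three vectors of ten components.
V1 = [0.31, 0.44, 0.41, -0.05, -0.31, -0.07, -0.00, 0.27, 0.42, 0.43]
V2 = [0.41, 0.08, -0.22, 0.09, 0.41, -0.57, 0.14, -0.45, 0.18, 0.16]
V3 = [0.09, -0.03, 0.01, -0.70, 0.12, -0.05, -0.69, -0.10, 0.02, -0.00]


def explained_ratio(eigenvalues, m):
    """Share of total variance captured by the first m eigenvalues."""
    return sum(eigenvalues[:m]) / sum(eigenvalues)


def smallest_m(eigenvalues, target):
    """Smallest m whose explained ratio is at least `target` (e.g. 0.99)."""
    total = sum(eigenvalues)
    running = 0.0
    for m, lam in enumerate(eigenvalues, start=1):
        running += lam
        if running / total >= target:
            return m
    return len(eigenvalues)


def pca_risk_factors(eigenvalues, eigenvectors, rng):
    """One draw of the risk factors from the truncated sum of sqrt(lambda_k) v_k Z_k.

    Only as many terms are used as there are eigenvectors supplied.
    """
    n = len(eigenvectors[0])
    eps = [0.0] * n
    for lam, vec in zip(eigenvalues, eigenvectors):
        z = rng.gauss(0.0, 1.0)            # one independent N(0, 1) per retained PC
        for i in range(n):
            eps[i] += math.sqrt(lam) * vec[i] * z
    return eps


if __name__ == "__main__":
    print("ratio of first three PCs:", explained_ratio(EIGENVALUES, 3))
    print("one draw:", pca_risk_factors(EIGENVALUES, [V1, V2, V3], random.Random(1)))
```

Each scenario now needs three normal draws instead of ten. Because the printed eigenvalues are rounded, the ratio for the first three PCs comes out marginally below 95%, so a strict 95% threshold in `smallest_m` would select one more component than the text uses.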

Example 10.2 Value a maximum option on 10 assets with a strike price of $95 and
a maturity of half a year. All asset values are currently $100 with volatilities of 30%
for all assets. The correlation matrix of risk factors is given previously. The interest
rate is 4%. We accept a maximum error of 5%.

The option payoff is \(\max[\max(S_1, S_2, \ldots, S_{10}) - 95, 0]\). As the option is traded
in European style, it is efficient to simulate terminal asset values directly. By Ito's
lemma, we know that the terminal value of the ith asset is given by

\[ S_i(T) = S_i(0) \exp\left\{ (r - \sigma_i^2/2)T + \sigma_i W_i(T) \right\} = S_i(0) \exp\left\{ (r - \sigma_i^2/2)T + \sigma_i \sqrt{T}\,\epsilon_i \right\}, \qquad (10.5) \]

where the vector \([\epsilon_i] \sim N(0, \Sigma)\). Our simulation obtains the option price to be 49.12.
See the online material for the VBA codes.


10.5 EXERCISES 

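1. Suppose that x(t) and y(t) are two correlated Ito processes such that

\[ dx = a(t, x)\,dt + b(t, x)\,dW_1, \]
\[ dy = \alpha(t, y)\,dt + \beta(t, y)\,dW_2, \]
\[ E(dW_1\,dW_2) = \rho\,dt. \]

Consider a function, f(t, x, y), which depends on both stochastic variables
x(t) and y(t). By modifying the proof of Theorem 4.1, show that the dynamics of
f(t, x, y) are

\[ df = \left( \frac{\partial f}{\partial t} + a \frac{\partial f}{\partial x} + \alpha \frac{\partial f}{\partial y} + \frac{b^2}{2} \frac{\partial^2 f}{\partial x^2} + \frac{\beta^2}{2} \frac{\partial^2 f}{\partial y^2} + \rho b \beta \frac{\partial^2 f}{\partial x\,\partial y} \right) dt + b \frac{\partial f}{\partial x}\,dW_1 + \beta \frac{\partial f}{\partial y}\,dW_2. \qquad (10.6) \]

This formula is known as Ito's lemma for two variables.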

2. Answer the following questions by considering the property of martingales
defined in Question 7 of Chapter 5.

(a) Consider a pair of asset price dynamics under the risk-neutral measure:

\[ dS_1 = rS_1\,dt + \sigma_1 S_1\,dW_1, \]
\[ dS_2 = rS_2\,dt + \sigma_2 S_2\,dW_2, \]
\[ E(dW_1\,dW_2) = \rho\,dt. \]

Show that the stochastic process \(X(t) = S_1(t)/S_2(t)\) is a martingale under the
Brownian motions \(W_1^*(t)\) and \(W_2^*(t)\), where

\[ W_1^*(t) = W_1(t) - \rho\sigma_2 t \quad \text{and} \quad W_2^*(t) = W_2(t) - \sigma_2 t. \]

(b) Under (a), show that X(t) has the dynamics:

\[ \frac{dX}{X} = \sigma\,dW^*, \]

where \(W^*\) is a Brownian motion and

\[ \sigma^2 = \sigma_1^2 - 2\rho\sigma_1\sigma_2 + \sigma_2^2. \]

(c) Consider a function of \(S_1\) and \(S_2\), \(V(t, S_1, S_2)\), which has the property that

\[ V(t, S_1, S_2) = S_2\,U(t, S_1/S_2) = S_2\,U(t, X). \]

Show that U(t, X) is a martingale under Brownian motions \(W_1^*(t)\) and \(W_2^*(t)\).

3. Consider the exchange option with payoff \(\max(S_1(T) - S_2(T), 0)\). Denote the
option pricing formula for this option as \(V_{ex}(t, S_1, S_2)\). By using the no-arbitrage
argument, one derives that the exchange option has the properties:

• There exists a function U such that \(V_{ex}(t, S_1, S_2) = S_2\,U(t, S_1/S_2)\).

• There exists a probability measure Q such that \(X(t) = S_1(t)/S_2(t)\) is a martingale.

On the basis of these properties and the results obtained in Question 2, show that

\[ V_{ex}(t, S_1, S_2) = S_1 \Phi(d_1^*) - S_2 \Phi(d_2^*), \]

where

\[ d_1^* = \frac{\log(S_1/S_2) + \sigma^2 (T - t)/2}{\sigma \sqrt{T - t}}, \]
\[ d_2^* = d_1^* - \sigma \sqrt{T - t}, \]
\[ \sigma^2 = \sigma_1^2 - 2\rho\sigma_1\sigma_2 + \sigma_2^2. \]

Herein, \(\sigma_1\) and \(\sigma_2\) are volatilities of \(S_1\) and \(S_2\), respectively, and \(\rho\) is the correlation
coefficient between the returns of two assets. This formula was first derived in
Margrabe (1978).

4. Run the simulation program for pricing the exchange option and compare the
numerical result with the analytical one.

5. The so-called geometric basket option has the payoff function

\[ \max\left( \left( \prod_{i=1}^{n} S_i(T) \right)^{1/n} - K,\ 0 \right). \]

(a) Show that this option has a value less than the usual basket option with payoff

\[ \max\left( \frac{1}{n} \sum_{i=1}^{n} S_i(T) - K,\ 0 \right). \]

(b) Suppose that individual assets follow the Black-Scholes dynamics. Derive the
analytical pricing formula for the geometric basket option.

(c) By regarding the price of the geometric basket option as a control variate,
simulate the price of the usual basket option that depends on four assets with
the following correlation matrix:

\[ \begin{pmatrix} 1.0000 & 0 & 0.3000 & 0.3000 \\ 0 & 1.0000 & 0.4000 & 0.2000 \\ 0.3000 & 0.4000 & 1.0000 & 0.3000 \\ 0.3000 & 0.2000 & 0.3000 & 1.0000 \end{pmatrix} \]

We assume that all assets share the same volatility of 30%, and each asset
individually follows the Black-Scholes dynamics.

6. Use simulation to determine the value and early exercise policy of American style
exchange options. We assume the interest rate of 5%, \(S_1 = 100\), \(S_2 = 95\), \(T = 1\)
year, and the variance-covariance matrix of asset returns:

\[ \begin{pmatrix} 0.016 & 0.006 \\ 0.006 & 0.09 \end{pmatrix}. \]

Hint: you may use the Least Squares model and a quadratic polynomial of \(S_1\) and
\(S_2\) in your regression.

The solutions and/or additional exercises are available online at
http://www.sta.cuhk.edu.hk/Book/SRMS/.
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INTEREST RATE MODELS 


11.1 INTRODUCTION 

Fixed income securities are concerned with the valuation of promised payments at a 
future date. For example, a zero coupon bond promises to pay a single payment on the 
maturity day. A straight U.S. Treasury bond promises to make payments, the amount 
and date of which are determined by the face value, maturity date, and coupon rate 
of the bond. Because cash flows are certain, we are not concerned with the risk of the 
volatility of the amount of cash. Instead, we are interested in the following question: 
How much would a rational individual be willing to pay today for a promised payment 
in the future? The answer to this question is related to the movement of the interest 
rate, which leads to the next question: What is the best way to manage the interest 
rate risk? Simulation can serve as a useful tool in answering these questions. 


11.2 DISCOUNT FACTOR AND BOND PRICES

Consider the simplest case in which a zero coupon bond (zero) will pay $1 a year 
from now. What is the maximum that one should be willing to pay for this contract 
today? Purchasing this bond should be worth at least as much as putting the money 
in the bank. Let P be the payment at the current moment. Then, 

\[ P(1 + R) = 1, \]


Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong. 
©2015 John Wiley & Sons, Inc. Published 2015 by John Wiley & Sons, Inc. 
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where R is the current annual interest paid by a bank (R is supposed to be a constant). 
That is,

\[ P = \frac{1}{1 + R}. \]

P and \(1/(1+R)\) are known as the zero price and the discount factor, respectively.

Now we define P(t, T) as the zero coupon bond price at time t with maturity at time
T. A typical bond will pay coupons at semiannual intervals and a principal payment
at maturity. Figure 11.1 illustrates the cash flow of a 3-year coupon bond. The key
to evaluating such a bond is to view the amounts promised at different future dates
as separate zero coupon bonds. We then value each payment at each date using the
discount factor for that date and sum up the values. Let P(t, T) be the corresponding
coupon-bearing bond price with a coupon \(C(t_i)\) paid at each coupon payment date
\(t_i\), \(i = 1, \ldots, N\). Then, P(t, T) can be valued by the formula:

N 

mD = J j c(t i )P(o,t i ). (li.D 

i=l 


For example, the value of a bond paying a semiannual coupon is given by 


q 1/2) C(l) C(N) + 

1 + R/2 (1 + R/ 2) 2 (1 + R/2) 2N 


To simplify the mathematics, we define r as the continuously compounded interest
rate. Its relationship with the annual interest rate R is given by the formula

\[ \frac{1}{1 + R} = e^{-r}. \]

In reality, interest rates are not constants but change over time. From now on, we
assume that the continuously compounded interest rate r is a function of time t, that
is, \(r = r_t\), and we call it the instantaneous interest rate. Suppose that we invest $1 in
the money market account B(0) today with the interest rate \(r_t\), then the interest will
roll over continuously at every instant by \(B(t) r_t\,dt\). At any time t, the money market
account B(t) satisfies

\[ dB(t) = B(t) r_t\,dt. \]

Figure 11.1 Cash flow of a 3-year coupon bond.

Solving the aforementioned differential equation, with the initial condition B(0) = 1,
we obtain

\[ B(t) = e^{\int_0^t r_s\,ds}. \qquad (11.2) \]

Conversely, if a bond pays the holder $1 at future time t, the bond is worth
\(1/B(t) = e^{-\int_0^t r_s\,ds}\) dollars at time 0, and we will use it as the discount factor for future cash
flow. So, for a deterministic interest rate \(r_t\), the time t zero coupon bond price P(t, T)
with maturity at T is given by

\[ P(t, T) = e^{-\int_t^T r_s\,ds}. \qquad (11.3) \]

Clearly, P(T, T) = 1. In the following section, we extend the interest rates to be
stochastic. Practically, the zero coupon bond price is expressed in terms of the
continuous yield to maturity R(t, T) by

\[ P(t, T) = e^{-R(t, T)(T - t)}. \qquad (11.4) \]

This yield to maturity corresponds to the constant interest rate of the continuously
compounded interest rate from time t to T and can serve as an indicator of the price
of the bond. If we are given the zero coupon bond prices from the market, the yield
can be recovered by

\[ R(t, T) = -\frac{\log P(t, T)}{T - t}. \qquad (11.5) \]


Similarly, for a coupon-bearing bond, 5(0, T) and 5(0, T) are related by 
5(0, T) = C(r 1 )e _ff(0 ’ ri)fl + C(t 2 )e- R(0 ’ t2) ' 2 + • • • + (C(t N ) + 


where \(C(t_i)\) is the coupon paid at time \(t_i\), \(i = 1, \ldots, N\).

Bond markets usually quote the yield in place of the interest rate \(r_t\). Bond prices are
available only for some discrete times to maturity \(T_i\), \(i = 1, \ldots, N\), such as 1-, 3-, and
5-year, so it is more convenient if we parametrize R(t, T) as a piecewise continuous
function and interpolate all of the discrete points to obtain a continuum of R(t, T) for
all T > 0.

For example, R(0, T) can be parametrized by a piecewise smooth cubic function
as follows:

\[ R(0, T) = \begin{cases} a_0 + b_0 T + c_0 T^2 + d_0 T^3, & \text{for } T \in [0, T_0], \\ a_1 + b_1 T + c_1 T^2 + d_1 T^3, & \text{for } T \in [T_0, T_1]. \end{cases} \qquad (11.6) \]

We can then use interpolation methods, such as cubic spline, to find the coefficients
by putting the market bond data into formula 11.5. Further discussions about yields
and interest rate models can be found in Jarrow (2002).

Example 11.1 Suppose R(0, T) is parametrized as

\[ R(0, T) = \begin{cases} 0.005 - 0.001T - 0.0001T^2 + 0.0005T^3, & \text{for } T \in [0, 2], \\ 0.0078 - 0.0052T + 0.002T^2 + 0.00015T^3, & \text{for } T \in [2, 3], \end{cases} \]

then it is first-order continuous. Figure 11.2 shows the graph of R(0, T) in Excel for
\(0 \le T \le 3\). The corresponding discount curve P(0, T) can also be obtained easily
from Figure 11.3.

For a 3-year coupon bond with a notional value of $100 and with a coupon rate of
6% paid semiannually, the price is given by

\[ P(0, 3) = 100\left[ 0.03\,e^{-R(0, 0.5) \times 0.5} + 0.03\,e^{-R(0, 1) \times 1} + \cdots + (0.03 + 1)\,e^{-R(0, 3) \times 3} \right] = 113.537. \]

This Excel file can be downloaded online. 
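
The same calculation can be sketched in Python as a cross-check on the spreadsheet. The function names are hypothetical. The quadratic coefficient on [2, 3] is taken as 0.002, which is an assumption about the printed value: it is the choice that makes the curve and its first derivative continuous at T = 2 and that leads to the bond price quoted in the example.

```python
import math


def yield_curve(T):
    """Piecewise cubic yield to maturity R(0, T) of Example 11.1, for 0 <= T <= 3."""
    if T <= 2.0:
        return 0.005 - 0.001 * T - 0.0001 * T ** 2 + 0.0005 * T ** 3
    return 0.0078 - 0.0052 * T + 0.002 * T ** 2 + 0.00015 * T ** 3


def discount_factor(T):
    """Zero coupon bond price P(0, T) = exp(-R(0, T) T), as in Equation 11.4."""
    return math.exp(-yield_curve(T) * T)


def yield_from_price(price, T):
    """Invert a zero price back to its yield, as in Equation 11.5 with t = 0."""
    return -math.log(price) / T


def coupon_bond_price(notional, annual_coupon_rate, maturity, payments_per_year=2):
    """Discount each coupon, and the principal at maturity, as separate zeros."""
    n = int(round(maturity * payments_per_year))
    coupon = annual_coupon_rate / payments_per_year
    value = 0.0
    for k in range(1, n + 1):
        t_k = k / payments_per_year
        cash = coupon + (1.0 if k == n else 0.0)   # principal repaid with last coupon
        value += cash * discount_factor(t_k)
    return notional * value


if __name__ == "__main__":
    print(round(coupon_bond_price(100.0, 0.06, 3.0), 3))
```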
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Figure 11.2 Yield to maturity R( 0, T). 
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Figure 11.3 Discount curve P( 0, 


11.3 STOCHASTIC INTEREST RATE MODELS AND THEIR 
SIMULATIONS 

Deterministic interest rate models are inadequate for capturing interest rate movements,
as the future interest rates cannot be known for certain. A better approach is to
incorporate the stochastic feature of the interest rates. A stochastic interest rate model
should match Equation 11.3 when the stochastic component is absent. A natural way
is to consider

\[ P(t, T) = E\left[ e^{-\int_t^T r_s\,ds} \,\middle|\, \mathcal{F}_t \right], \qquad (11.7) \]

where \(W_s\) is a vector of stochastic factors driving the short rate \(r_s\), and \(\mathcal{F}_t\) is the
filtration generated by \(\{W_s, t \ge s \ge 0\}\). Intuitively, \(\mathcal{F}_t\) consists of all of the information
available up to time t. Given that at current time t, the future interest rates from time t
to T are all random, we need to take the expectation conditional on the current
information. If stochastic factors are absent, the function inside the expectation becomes
deterministic and the expectation is equal to the function itself.

For pricing derivatives with stochastic interest rates, the future cash flows are
discounted using the zero coupon bond price from Equation 11.7 under the risk neutral
expectation. There are many ways to model the interest rate movement. In this case,
we consider the short rate models, in which the instantaneous interest rate \(r_t\) is
specified by a stochastic differential equation (SDE).
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From a simulation perspective, expression 11.7 offers a means to conduct Monte 
Carlo simulations. Once an appropriate stochastic interest rate model, such as the
Vasicek model of Vasicek (1977), the CIR model of Cox, Ingersoll and Ross (1985),
the Ho-Lee model of Ho and Lee (1986), and the Hull-White model of Hull and
White (1988), is formulated, simulations can be conducted.

To illustrate this idea, consider a short rate model that follows

\[ dr_t = \mu(t, r_t)\,dt + \beta(t, r_t)\,dW_t, \qquad (11.8) \]

where \(r_t\) is the current continuously compounded interest rate and \(W_t\) is a Wiener
process. For example, the Vasicek model assumes that \(\mu(t, r) = a(b - r)\) and
\(\beta(t, r) = \sigma\), whereas the CIR model uses the same \(\mu(t, r)\) with \(\beta(t, r) = \sigma\sqrt{r}\). Sample paths
for short rate models in the form of Equation 11.8 can be generated by the following
steps:

Step 1: Set \(r_0\) to be the current market rate.

Step 2: Generate \(\epsilon \sim N(0, 1)\).

Step 3: Set \(r_{i+1} = r_i + \mu(t_i, r_i)\,\Delta t + \beta(t_i, r_i)\,\epsilon \sqrt{\Delta t}\).

Step 4: Go to Step 2.
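
The steps above can be sketched as a small Euler scheme, combined with a Monte Carlo estimate of the zero coupon bond price as the average of exp(-integral of r). Everything specific in this sketch is an assumption made for illustration: the parameter values, the left-endpoint quadrature for the integral, the function names, and the truncation of r at zero inside the CIR square root (a common safeguard against negative rates in the discretized scheme).

```python
import math
import random


def simulate_short_rate(r0, mu, beta, horizon, n_steps, rng):
    """Euler path r_0, r_1, ..., r_n of dr = mu(t, r) dt + beta(t, r) dW."""
    dt = horizon / n_steps
    path = [r0]                                            # Step 1
    for i in range(n_steps):
        t_i, r_i = i * dt, path[-1]
        eps = rng.gauss(0.0, 1.0)                          # Step 2
        path.append(r_i + mu(t_i, r_i) * dt
                    + beta(t_i, r_i) * eps * math.sqrt(dt))  # Step 3
    return path


def mc_zero_price(r0, mu, beta, horizon, n_steps=100, n_paths=2000, seed=7):
    """Monte Carlo estimate of P(0, T) = E[exp(-integral_0^T r_s ds)]."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    total = 0.0
    for _ in range(n_paths):
        path = simulate_short_rate(r0, mu, beta, horizon, n_steps, rng)
        integral = sum(path[:-1]) * dt                     # left-endpoint quadrature
        total += math.exp(-integral)
    return total / n_paths


# Hypothetical parameters for illustration only.
a, b, sigma = 0.5, 0.04, 0.01


def vasicek_mu(t, r):
    return a * (b - r)


def vasicek_beta(t, r):
    return sigma


def cir_beta(t, r):
    return sigma * math.sqrt(max(r, 0.0))   # truncation at zero is an added safeguard


if __name__ == "__main__":
    print("Vasicek P(0,1) ~", mc_zero_price(0.03, vasicek_mu, vasicek_beta, 1.0))
    print("CIR     P(0,1) ~", mc_zero_price(0.03, vasicek_mu, cir_beta, 1.0))
```

Passing the drift and diffusion as functions keeps the scheme generic: any model of the form of Equation 11.8 can be plugged in without changing the simulation loop.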

Let r® = { r^Ht) : t = 0, i, -, . . . , T] be the j-th interest rate path out of M sample 
paths generated by the preceding algorithm with At = - . By means of quadrature, we 
can make the following approximation: 



If we take At = — , the zero coupon bond price /TO, 1) can approximated by 



In general, we write 



M 



(11.10) 




11.4 HULL-WHITE MODEL 

Although many short rate models have been proposed to model the dynamics of inter- 
est rates, we illustrate the pricing of zero coupon bonds and calibration of the model 
parameters under the Hull and White (1994) model. This model admits the analyt- 
ical bond price formula and can therefore simplify the pricing of other exotic fixed 
income derivatives. The instantaneous interest rate r t is assumed to follow the SDE, 
as follows: 


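\[ dr_t = [\theta(t) - a r_t]\,dt + \sigma\,dW_t, \qquad (11.11) \]

where \(\theta(t)\) is a deterministic function of time, a and \(\sigma\) are constants, and \(W_t\) is a
Wiener process. Applying Ito's lemma to \(e^{at} r_t\), we have

\[ d(e^{at} r_t) = \theta(t) e^{at}\,dt + e^{at} \sigma\,dW_t. \]

Rewriting the aforementioned equation into the integral form with some simplifications
yields the following representation of \(r_T\):

\[ r_T = r_t e^{-a(T - t)} + \int_t^T \theta(\tau) e^{-a(T - \tau)}\,d\tau + \sigma \int_t^T e^{-a(T - \tau)}\,dW_\tau. \qquad (11.12) \]

The following fact is useful in interest rate modeling. For a deterministic function
y(t), let

\[ I(t) = \int_0^t y(s)\,dW_s, \]

then I(t) is a normal random variate with a mean of 0 and variance \(\int_0^t y^2(s)\,ds\). The
proof is in Exercise 1. Therefore the interest rate \(r_T\) is normally distributed. By Ito's
identities, see Exercise 1(d) in Chapter 4, the conditional expectation and variance of
\(r_T\) given \(r_t\) at time t are

\[ E[r_T \mid r_t] = r_t e^{-a(T - t)} + \int_t^T \theta(\tau) e^{-a(T - \tau)}\,d\tau \qquad (11.13) \]

and

\[ \mathrm{Var}(r_T \mid r_t) = E\left[ \left( \sigma \int_t^T e^{-a(T - \tau)}\,dW_\tau \right)^2 \right] = \sigma^2 \int_t^T e^{-2a(T - \tau)}\,d\tau = \frac{\sigma^2}{2a}\left[ 1 - e^{-2a(T - t)} \right]. \qquad (11.14) \]

To derive the analytical formula for a zero coupon bond, we need to evaluate \(\int_t^T r_\tau\,d\tau\).
By changing the order of integration,

\[ \begin{aligned}
\int_t^T r_\tau\,d\tau &= \int_t^T r_t e^{-a(u - t)}\,du + \int_t^T \int_t^u \theta(\tau) e^{-a(u - \tau)}\,d\tau\,du + \sigma \int_t^T \int_t^u e^{-a(u - \tau)}\,dW_\tau\,du \\
&= r_t \frac{1 - e^{-a(T - t)}}{a} + \int_t^T \int_\tau^T \theta(\tau) e^{-a(u - \tau)}\,du\,d\tau + \sigma \int_t^T \int_\tau^T e^{-a(u - \tau)}\,du\,dW_\tau \\
&= r_t \frac{1 - e^{-a(T - t)}}{a} + \int_t^T \theta(\tau) \frac{1 - e^{-a(T - \tau)}}{a}\,d\tau + \frac{\sigma}{a} \int_t^T \left[ 1 - e^{-a(T - \tau)} \right] dW_\tau. \qquad (11.15)
\end{aligned} \]

Therefore, \(\int_t^T r_\tau\,d\tau\) is still normally distributed with the mean and variance given as

\[ E\left[ \int_t^T r_\tau\,d\tau \,\middle|\, r_t \right] = r_t \left( \frac{1 - e^{-a(T - t)}}{a} \right) + \int_t^T \theta(\tau) \left( \frac{1 - e^{-a(T - \tau)}}{a} \right) d\tau \qquad (11.16) \]

and

\[ \mathrm{Var}\left( \int_t^T r_\tau\,d\tau \,\middle|\, r_t \right) = \frac{\sigma^2}{a^2} \int_t^T \left[ 1 - e^{-a(T - \tau)} \right]^2 d\tau = \frac{\sigma^2}{2a^3} \left[ 2a(T - t) - 3 + 4e^{-a(T - t)} - e^{-2a(T - t)} \right]. \qquad (11.17) \]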


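The moment-generating function of a normal random variable X is given by

\[ E[e^{uX}] = \exp\left\{ u E[X] + \frac{u^2}{2} \mathrm{Var}(X) \right\}. \]

The zero coupon price is

\[ P(t, T) = E\left[ e^{-\int_t^T r_\tau\,d\tau} \,\middle|\, \mathcal{F}_t \right] = \exp\{ C(t, T) - D(t, T) r_t \}, \qquad (11.18) \]

where

\[ D(t, T) = \frac{1 - e^{-a(T - t)}}{a}, \]
\[ C(t, T) = -\int_t^T \theta(\tau) \left( \frac{1 - e^{-a(T - \tau)}}{a} \right) d\tau + \frac{\sigma^2}{4a^3} \left[ 2a(T - t) - 3 + 4e^{-a(T - t)} - e^{-2a(T - t)} \right]. \]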
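
As a numerical companion to Equation 11.18, the sketch below evaluates the bond price with D(t, T) = (1 - e^{-a(T-t)})/a and C(t, T) equal to minus the integral of θ(τ)D(τ, T) plus the variance term σ²/(4a³)[2a(T-t) - 3 + 4e^{-a(T-t)} - e^{-2a(T-t)}]. The integral of θ is computed with a midpoint rule; that choice, the grid size, the function names, and the parameter values in the demo are assumptions made for illustration.

```python
import math


def hw_D(a, t, T):
    """D(t, T) = (1 - exp(-a (T - t))) / a."""
    return (1.0 - math.exp(-a * (T - t))) / a


def hw_C(a, sigma, theta, t, T, n_grid=2000):
    """C(t, T): minus the theta integral plus the variance correction."""
    h = (T - t) / n_grid
    integral = 0.0
    for k in range(n_grid):                       # midpoint rule on [t, T]
        tau = t + (k + 0.5) * h
        integral += theta(tau) * hw_D(a, tau, T) * h
    x = a * (T - t)
    variance_term = sigma ** 2 / (4.0 * a ** 3) * (
        2.0 * x - 3.0 + 4.0 * math.exp(-x) - math.exp(-2.0 * x))
    return -integral + variance_term


def hw_zero_price(r_t, a, sigma, theta, t, T):
    """Zero coupon bond price exp(C(t, T) - D(t, T) r_t)."""
    return math.exp(hw_C(a, sigma, theta, t, T) - hw_D(a, t, T) * r_t)


if __name__ == "__main__":
    # Constant theta = a * b, i.e. mean reversion towards a level b = 0.04.
    print(hw_zero_price(0.03, a=0.5, sigma=0.01, theta=lambda tau: 0.02, t=0.0, T=1.0))
```

With a constant θ the integral has a closed form, so the quadrature is only needed when θ(t) has been calibrated to a market curve.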


Another method of deriving the solution by the PDE approach is provided in the
exercise. Now we assume that a and \(\sigma\) are known. If we want to use the model, then we
have to calibrate \(\theta(t)\) to the current market zero coupon bond prices. In other words,
given P(t, T) from the market data for different maturities T, we need to express \(\theta(t)\)
in terms of P(t, T), which is more complicated. Therefore, it is more convenient to
decompose \(r_t\) by

\[ r_t = \alpha(t) + x_t, \qquad (11.19) \]

where \(x_t\) follows

\[ dx_t = -a x_t\,dt + \sigma\,dW_t. \]

\(\alpha(t)\) is a deterministic function that incorporates the information in \(\theta(t)\), and \(x_t\)
corresponds to the random component driven by \(W_t\). We take \(x_0 = 0\) and \(\alpha(0) = r_0\). To
simulate \(r_t\), we just need to perform the simulation on \(x_t\) and add the corresponding
\(\alpha(t)\) at each step. Figure 11.4 plots the graph of a simulated path of \(x_t\) and Figure 11.5
shows an example of \(\alpha(t)\). Their sum leads to the sample path of \(r_t\) in Figure 11.6.

Figure 11.4 Simulated sample path of \(x_t\).

Figure 11.5 \(\alpha(t)\).

Figure 11.6 \(r_t\).

Note that \(x_t\) actually follows a degenerate Hull-White model; in fact, it is an
Ornstein-Uhlenbeck process, where \(\theta(t) = 0\) for all t. This property will be useful
for evaluating a similar expectation to that in Equation 11.18 in terms of \(x_t\). Denote
\(P_0(t, T)\) and \(C_0(t, T)\) as the corresponding P(t, T) and C(t, T) in Equation 11.18 when
\(\theta(t) = 0\), respectively. To express \(\alpha(t)\) in terms of P(t, T), we evaluate


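$$
\begin{aligned}
P(t,T) &= E\left[e^{-\int_t^T (\alpha(\tau)+x_\tau)\,d\tau} \,\Big|\, r_t\right] \\
&= e^{-\int_t^T \alpha(\tau)\,d\tau}\,E\left[e^{-\int_t^T x_\tau\,d\tau} \,\Big|\, x_t\right] \\
&= e^{-\int_t^T \alpha(\tau)\,d\tau}\,P_0(t,T).
\end{aligned}
\tag{11.20}
$$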




Taking the natural logarithm on both sides yields

$$
\int_t^T \alpha(\tau)\,d\tau = -\log\frac{P(t,T)}{P_0(t,T)}.
$$

Assume that $t = 0$ and differentiate both sides with respect to $T$; then we have

$$
\alpha(T) = -\frac{\partial}{\partial T}\log\frac{P(0,T)}{P_0(0,T)}. \tag{11.21}
$$


Example 11.2 When $R(0,T)$ is estimated from the market data using the parametric form given in Equation 11.6, $\alpha(T)$ can be computed explicitly.

Substituting $R(0,T)$ into formula 11.21 gives

$$
\alpha(T) = -\frac{\partial}{\partial T}\log e^{-R(0,T)T - C_0(0,T) + D(0,T)x_0}
= \begin{cases}
a_0 + 2b_0T + 3c_0T^2 + 4d_0T^3 + \dfrac{\sigma^2}{2a^2}\left(1 - 2e^{-aT} + e^{-2aT}\right), & \text{for } T \in [0, T_0], \\[2ex]
a_1 + 2b_1T + 3c_1T^2 + 4d_1T^3 + \dfrac{\sigma^2}{2a^2}\left(1 - 2e^{-aT} + e^{-2aT}\right), & \text{for } T \in [T_0, T_1].
\end{cases}
$$


Knowing $\alpha(T)$ in closed form, the sample paths in the Hull-White model can be generated by the following steps:

Step 1: Set $r_0$ to be the current market rate and $x_0 = 0$.
Step 2: Generate $\epsilon \sim N(0, 1)$.
Step 3: Set $x_{i+1} = x_i - a x_i\Delta t + \sigma\epsilon\sqrt{\Delta t}$.
Step 4: Set $r_{i+1} = \alpha(t_{i+1}) + x_{i+1}$.
Step 5: Go to Step 2.
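The steps above can be sketched in Python as follows. This is only a minimal illustration: it assumes that the calibrated $\alpha(t)$ is supplied as a callable `alpha`, and the function name, its arguments, and the optional `rng` argument are hypothetical choices made here rather than part of the text.

```python
import math
import random


def simulate_hull_white_euler(alpha, r0, a, sigma, T, n_steps, rng=random):
    """Euler path of r_t = alpha(t) + x_t with dx = -a x dt + sigma dW and x_0 = 0."""
    dt = T / n_steps
    x = 0.0                      # Step 1: x_0 = 0
    times = [0.0]
    rates = [r0]                 # Step 1: r_0 is the current market rate
    for i in range(n_steps):
        eps = rng.gauss(0.0, 1.0)                             # Step 2
        x = x - a * x * dt + sigma * eps * math.sqrt(dt)      # Step 3
        t_next = (i + 1) * dt
        times.append(t_next)
        rates.append(alpha(t_next) + x)                       # Step 4
    return times, rates
```

Passing `rng=random.Random(seed)` makes a path reproducible. Setting `sigma=0` returns the deterministic curve $\alpha(t)$, which is a convenient sanity check.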


11.5 FIXED INCOME DERIVATIVES PRICING 

For standard European options, introducing a stochastic interest rate into the 
Black-Scholes model has a minimal effect on prices, so interest rates are usually 
taken to be constant for vanilla stock options. However, for interest-rate-sensitive 
instruments, such as options on bonds and range accrual notes, stochastic interest
rates should be used in the model. 

Consider a coupon bond selling at time $T$ (taking the current time to be 0) that pays coupons $c_i$ at times $T_i$, $i = 1, 2, \ldots, n$, with the principal repaid together with the last coupon at maturity $T_n$. An option on this coupon bond with maturity $T$ and strike $K$ can be priced by applying the simulation method in the previous section together with the analytical bond price formula. The value of all the future cash flows of the coupon bond discounted back to time $T$ is

$$
\sum_{i=1}^{n} c_i\,P(T, T_i) + P(T, T_n).
$$

By the risk-neutral pricing principle and applying formula 11.20, the price of the option is given by

(11.22)

where $\tilde c_i = c_i$ for $i = 1, 2, \ldots, n-1$ and $\tilde c_n = c_n + 1$. The fair price of the bond option can be found by taking the following steps:
can be found by taking the following steps: 

Step 1: Simulate a path of $r_t$ for $0 \le t \le T$ according to the algorithm in Example 11.2.
Step 2: Calculate the discount factor $e^{-\int_0^T r_\tau\,d\tau}$ by Equation 11.9.
Step 3: Evaluate the payoff function in Equation 11.22.
Step 4: Repeat Steps 1 to 3 $M$ times and average the discounted payoffs.

Although we deal only with European-style bond options in this section, there are also American and Bermudan bond options in the market. American bond options can be exercised at any time up to maturity, whereas Bermudan options allow the holder to exercise only on prespecified discrete dates. In terms of pricing, there are other methods, such as the trinomial tree, that can price European-style options. American and Bermudan options are more difficult to price because of their path dependency. Simulation thus offers a simple way to price path-dependent derivatives.

Example 11.3 Suppose the yield to maturity is parametrized as in Example 11.1 and $a = 10\%$, $\sigma = 1\%$, $r_0 = 0.002$. Now consider an option on a 3-year coupon bond with a notional value of \$100 and option maturity of 1 year. The coupon rate is 6% and is paid semiannually. The strike of the option is taken to be \$100.

Evaluating Equation 11.22 by simulation gives a fair price of 6.87.

Please see the online material for the VBA codes.
Apart from using the Euler scheme to discretize the SDE of $x_t$ in Example 11.2, we can also use exact simulation by solving the SDE explicitly:

$$
x_t = x_s e^{-a(t-s)} + \sigma e^{-at}\int_s^t e^{a\tau}\,dW_\tau. \tag{11.23}
$$

This implies that we can replace Step 3 in Example 11.2 by

Step 3': Set $x_{i+1} = x_i e^{-a\Delta t} + \sqrt{\dfrac{\sigma^2}{2a}\left(1 - e^{-2a\Delta t}\right)}\,\epsilon$.

Readers can compare the performance of these two methods in the exercise.
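To make the comparison concrete, the following Python sketch places the exact update of Step 3' next to the Euler update of Step 3. The function names are illustrative only; `eps` stands for a standard normal draw supplied by the caller.

```python
import math


def ou_exact_coefficients(a, sigma, dt):
    """Decay factor and conditional standard deviation implied by Equation 11.23."""
    decay = math.exp(-a * dt)
    std = sigma * math.sqrt((1.0 - math.exp(-2.0 * a * dt)) / (2.0 * a))
    return decay, std


def ou_exact_step(x, a, sigma, dt, eps):
    """Step 3': exact transition of x over one step of length dt."""
    decay, std = ou_exact_coefficients(a, sigma, dt)
    return x * decay + std * eps


def ou_euler_step(x, a, sigma, dt, eps):
    """Step 3: Euler approximation of the same transition."""
    return x - a * x * dt + sigma * math.sqrt(dt) * eps
```

For small $\Delta t$ the two updates nearly coincide, whereas the exact step remains valid for any step size because it uses the true conditional mean and variance.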

A range accrual is a type of structured product in which the payoff is conditional on a certain index falling within a predetermined range. The coupon the holder receives is proportional to the fraction of observation dates on which the observed index lies in the range. Buyers of range accrual products usually anticipate a steady movement of the index for it to be profitable.

Consider a simple range accrual note with principal $P$, which depends on the 3-month yield to maturity $R(t, t + \Delta t)$, where $\Delta t = 0.25$. For simplicity, the note is assumed to have only one coupon payment $g$ (quoted as a fraction of the principal) at maturity $T$. Cases with quarterly or semiannual coupon payments can be handled similarly. Let $N$ be the number of observation dates $t_1, \ldots, t_N$, and $[h_1, h_2]$ be the preset range. After $\alpha(t)$ is calibrated to market data, we can price the note by simulation. The fair price of the range accrual note is given by

$$
E\left[e^{-\int_0^T r_\tau\,d\tau}\,P\left(1 + \frac{g}{N}\sum_{i=1}^{N}\mathbf{1}_{\{R(t_i,\,t_i+\Delta t)\in[h_1,h_2]\}}\right)\right]. \tag{11.24}
$$


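As $R(t_i, t_i + \Delta t)$ is the yield to maturity that will be available only at the future time $t_i$, we need to evaluate it from the sample paths of $r_t$. Express $R(t_i, t_i + \Delta t)$ in terms of $x_{t_i} = r_{t_i} - \alpha(t_i)$ and $\alpha(t)$ as

$$
R(t_i, t_i + \Delta t) = -\frac{\log P(t_i, t_i + \Delta t)}{\Delta t}
= \frac{1}{\Delta t}\left[\int_{t_i}^{t_i+\Delta t}\alpha(\tau)\,d\tau - C_0(t_i, t_i + \Delta t) + D(t_i, t_i + \Delta t)\,x_{t_i}\right].
$$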


Then, for each future observation date $t_i$, we will be able to determine whether $R(t_i, t_i + \Delta t)$ falls in the range. The simulation procedure can be summarized as follows:

Step 1: Generate a sample path of $r_t$ for $t = 0$ to $t = T$.
Step 2: Calculate the discount factor $e^{-\int_0^T r_\tau\,d\tau}$ by Equation 11.9.
Step 3: Determine the number of $R(t_i, t_i + \Delta t)$ that fall in the range.
Step 4: Evaluate the payoff.
Step 5: Repeat Steps 1 to 4 $M$ times and average the discounted payoffs.
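The procedure above can be sketched in Python as follows. The sketch rests on several assumptions that are not in the text: $\alpha(t)$ is passed as a callable `alpha`; the discount factor is approximated by the trapezoidal rule (Equation 11.9 may prescribe a different quadrature); the payoff is principal times one plus the coupon scaled by the fraction of in-range observation dates, in line with Example 11.4; and all function and argument names are hypothetical.

```python
import math
import random


def D(s, a):
    """D for a time to maturity s = T - t."""
    return (1.0 - math.exp(-a * s)) / a


def C0(s, a, sigma):
    """C for a time to maturity s when theta = 0."""
    return sigma ** 2 / (4.0 * a ** 3) * (
        2.0 * a * s - 3.0 + 4.0 * math.exp(-a * s) - math.exp(-2.0 * a * s))


def integrate(f, lo, hi, n=200):
    """Midpoint rule for the integral of alpha over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


def range_accrual_price(alpha, a, sigma, T, principal, g, N, h1, h2,
                        tenor=0.25, n_paths=1000, steps_per_obs=5, seed=0):
    rng = random.Random(seed)
    dt = T / (N * steps_per_obs)
    # Deterministic part of R(t_i, t_i + tenor) at each observation date t_i = i T / N.
    fixed = [(integrate(alpha, i * T / N, i * T / N + tenor) - C0(tenor, a, sigma)) / tenor
             for i in range(1, N + 1)]
    slope = D(tenor, a) / tenor
    total = 0.0
    for _ in range(n_paths):
        x, r_prev, int_r, hits = 0.0, alpha(0.0), 0.0, 0
        for i in range(N):
            for j in range(steps_per_obs):
                x += -a * x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
                r = alpha((i * steps_per_obs + j + 1) * dt) + x
                int_r += 0.5 * (r_prev + r) * dt      # trapezoidal rule for the integral of r
                r_prev = r
            if h1 <= fixed[i] + slope * x <= h2:      # yield observed at t_{i+1}
                hits += 1
        total += math.exp(-int_r) * principal * (1.0 + g * hits / N)
    return total / n_paths
```

With `sigma=0` and a flat `alpha`, every observation is either inside or outside the range, so the price collapses to a discounted fixed payoff; this gives a simple way to validate the implementation.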
Example 11.4 For a 1-year range accrual note with a principal of \$100 and a coupon payment of 8% that depends on the 3-month yield to maturity $R(t, t + 0.25)$, assume there are 52 observation dates and the coupon is accumulated within the range of $[0.002, 0.008]$ for $R(t, t + 0.25)$, and $a = 10\%$, $\sigma = 1\%$, $r_0 = 0.002$.

The fair price of this accrual note is given by

$$
E\left[e^{-\int_0^1 r_\tau\,d\tau}\;100\left(1 + \frac{0.08}{52}\sum_{i=1}^{52}\mathbf{1}_{\{R(i/52,\; i/52+0.25)\in[0.002,\,0.008]\}}\right)\right] = 102.7.
$$


11.6 EXERCISES 

1. Let $\delta_s$ be a deterministic function and $W_s$ be a Brownian motion; consider

$$
I(t) = \int_0^t \delta_s\,dW_s.
$$

(a) Using Ito's lemma on $e^{uI(t)}$ and Ito's identities in Exercise 1(d) in Chapter 4, show that

$$
E[e^{uI(t)}] = 1 + \frac{1}{2}u^2\int_0^t \delta_s^2\,E[e^{uI(s)}]\,ds.
$$

(b) Let $y = E[e^{uI(t)}]$, the moment-generating function of $I(t)$. By differentiating the aforementioned equation, derive and solve the following ordinary differential equation (ODE):

$$
\frac{dy}{dt} = \frac{1}{2}u^2\delta_t^2\,y.
$$

Show that for each $t$, $I(t)$ is a normal random variable with a mean of 0 and variance of $\int_0^t \delta_s^2\,ds$ by the uniqueness of the moment-generating function.

2. Under the Hull-White model, the zero coupon bond price has the form

$$
P(t,T) = e^{\alpha(t,T) + \beta(t,T)\,r_t}.
$$

(a) Using the Feynman-Kac formula in Exercise 5 of Chapter 5, show that $\alpha(t,T)$ and $\beta(t,T)$ satisfy the following system of ODEs:

$$
\frac{d\alpha(t,T)}{dt} = -\beta(t,T)\,\theta(t) - \frac{1}{2}\beta^2(t,T)\,\sigma^2, \qquad \alpha(T,T) = 0,
$$

$$
\frac{d\beta(t,T)}{dt} = a\,\beta(t,T) + 1, \qquad \beta(T,T) = 0.
$$

(b) Solve the system of ODEs in part (a) and compare the result with formula 11.18.

3. Suppose the yield to maturity $R(t,T)$ is parametrized as

$$
R(t,T) = \begin{cases}
a_0 + b_0T + c_0T^2 + d_0T^3 + e_0T^4, & \text{for } T \in [0, T_0], \\
a_1 + b_1T + c_1T^2 + d_1T^3 + e_1T^4, & \text{for } T \in [T_0, T_1], \\
a_2 + b_2T + c_2T^2 + d_2T^3 + e_2T^4, & \text{for } T \in [T_1, T_2].
\end{cases}
$$

Find $\alpha(T)$ for $T \in [0, T_2]$ in the Hull-White model.

4. Consider the CIR model:

$$
dr = 0.1(0.05 - r)\,dt + 0.3\sqrt{r}\,dW \quad\text{and}\quad r_0 = 0.052.
$$

(a) Construct and implement a standard Monte Carlo simulation to compute the discount factor.

(b) Use the Vasicek discount factor, which corresponds to formula 11.18 for $\theta(t) = c$, where $c$ is a constant, as a control variate to improve the simulation in (a). (Hint: you may use $\sigma_{\text{Vasicek}} = \sigma_{\text{CIR}}\sqrt{r_0}$.)

(c) Compare the difference between the two prices on the basis of 1,000 simulated prices.

5. Consider the Ho-Lee interest rate movement:

$$
dr = \theta(t)\,dt + \sigma\,dW, \tag{*}
$$

where $\theta(t) = a + e^{-bt}$, $\sigma$ is a constant, and $W$ is the standard Brownian motion.

(a) Provide an algorithm to compute $B(0,t)$ by discretizing (*).

(b) To price a 5-year bond paying semiannual coupons, you adopted $\Delta t = 1/250$ to calculate the integration, and $M = 1,000$ to estimate each discount factor. What is the minimum size for the random sample used to compute the bond price with simulations in (a)?

(c) Express $r_t$ in terms of $a$, $b$, $\sigma$, $h$ and $r_0$ based on the algorithm in (b), where $h = \Delta t$. Hence, show that

$$
r_t = r_0 + at + \frac{1 - e^{-bt}}{b} + \sigma\sqrt{t}\,\epsilon, \quad\text{when } h \to 0,
$$

where $\epsilon \sim N(0, 1)$.

(d) Modify the approach in Section 11.4 to derive a closed form solution for the discount factor under the Ho-Lee model.

6. The dynamics of $x_t$ are given by

$$
dx_t = -a x_t\,dt + \sigma\,dW_t. \tag{11.25}
$$

(a) Using Ito's lemma on $e^{at}x_t$, derive Equation 11.23, then justify the equation in Step 3'.

(b) Apply the revised algorithm (using Step 3' to simulate $x_t$) in Example 11.3, and compare the result with the original algorithm.

The solutions and/or additional exercises are available online at http://www.sta.cuhk.edu.hk/Book/SRMS/.
MARKOV CHAIN MONTE CARLO METHODS


12.1 INTRODUCTION 

Bayesian inference is an important area in statistics and has also found applications in 
various disciplines. One of the main ingredients of Bayesian inference is the incorpo- 
ration of prior information via the specification of prior distributions. As information 
flows freely in financial markets, incorporating prior information with Bayesian ideas 
constitutes a natural approach. In this final chapter, we briefly introduce the essence 
of Bayesian statistics with reference to risk management. In particular, we discuss 
the celebrated Markov Chain Monte Carlo (MCMC) method in detail and illustrate 
its uses via a case study. 


12.2 BAYESIAN INFERENCE 

The essence of the Bayesian approach is to incorporate uncertainties for the unknown parameters. Predictive inference is conducted via the joint probability distribution of the parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_r)$, conditional on the observable data $x = (x_1, \ldots, x_n)$. The joint distribution is deduced from the distribution of observable quantities via Bayes' theorem. Many excellent texts have been written about the Bayesian paradigm; see, for example, DeGroot (1970), Box and Tiao (1973), Berger (1985), O'Hagan (1994), Bernardo and Smith (2000), Lee (2004), and Robert (2001), to name just a few. Tsay (2010) provides a succinct introduction to Bayesian inference for time series.


Simulation Techniques in Financial Risk Management, Second Edition. Ngai Hang Chan and Hoi Ying Wong. 
©2015 John Wiley & Sons, Inc. Published 2015 by John Wiley & Sons, Inc. 
The observational (or sampling) distribution $f(x|\theta)$ is the likelihood function. Under the Bayesian framework, a prior distribution $p(\theta)$ is specified for the parameter $\theta$. Inferences are conducted on the basis of the posterior distribution $\pi(\theta|x)$ according to the following identity:

$$
\pi(\theta|x) = \frac{f(x|\theta)\,p(\theta)}{f(x)}, \tag{12.1}
$$

where $f(x)$ is the marginal density such that

$$
f(x) = \int f(x|\theta)\,p(\theta)\,d\theta.
$$


The probability density function $\pi(\theta|x)$ is known as the posterior density function. Because $x$ is observed, the marginal density in Equation 12.1 is a constant. It is more convenient to express Equation 12.1 as

$$
\pi(\theta|x) \propto L(\theta)\,p(\theta), \tag{12.2}
$$

where $L(\theta) = f(x|\theta)$ is the likelihood function. One way to estimate $\theta$ is to compute the posterior mean of $\theta$, that is,

$$
\hat\theta = \int \theta\,\pi(\theta|x)\,d\theta. \tag{12.3}
$$


The prior and posterior are relative to the observables. A posterior distribution 
conditional on x can be used as a prior for a new observation y. This process can be 
iterated and eventually leads to a new posterior via Bayes' theorem. We illustrate this
idea with a concrete example. 


Example 12.1 Suppose that we observe $x_1, \ldots, x_n$, independent random variables each $N(\mu, \sigma^2)$ with $\mu$ unknown and $\sigma^2$ known. Estimate $\mu$ in a Bayesian setting.

The likelihood function is

$$
L(\mu) = \frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\sum_{i=1}^{n}\frac{(x_i - \mu)^2}{2\sigma^2}\right] \propto \exp\left[-\frac{(\bar x - \mu)^2}{2\sigma^2/n}\right],
$$

where $\bar x$ is the sample mean of the observations. It appears natural to assume that $\mu$ follows a normal distribution by specifying the prior $p(\mu) \sim N(m, \tau^2)$, where $m$ and $\tau^2$ are known as hyperparameters. Substituting this prior into Equation 12.2, we have

$$
\pi(\mu|x) \propto \exp\left[-\frac{(\bar x - \mu)^2}{2\sigma^2/n}\right]\exp\left[-\frac{(\mu - m)^2}{2\tau^2}\right] \propto \exp\left[-\frac{(\mu - m_1)^2}{2\tau_1^2}\right],
$$

where

$$
m_1 = \frac{\tau^2\bar x + m\sigma^2/n}{\tau^2 + \sigma^2/n}, \qquad \tau_1^2 = \frac{\tau^2\sigma^2}{n\tau^2 + \sigma^2};
$$

equivalently,

$$
\mu\,|\,x \sim N(m_1, \tau_1^2).
$$

The posterior mean $\hat\mu = E(\mu|x) = m_1$ is an estimate of $\mu$ given $x$. Note that $m_1$ tends to the sample mean $\bar x$ and $\tau_1^2$ tends to zero as the number of observations increases. In most cases, the prior distribution plays a lesser role when the sample size is large. Another interesting observation is that the prior contains less information as $\tau^2$ increases. When $\tau^2 \to \infty$, $p(\mu) \propto$ constant, and $\pi(\mu|x) = N(\bar x, \sigma^2/n)$. Such a prior is known as a noninformative prior, as it provides no information about the distribution of $\mu$.

TABLE 12.1 Conjugate Priors

Likelihood $L(\theta)$                          Conjugate Prior $p(\theta)$
Poisson, $\theta = \lambda$                     $G(\alpha, \beta)$
Binomial, $\theta = p$                          $Be(\alpha, \beta)$
Normal, $\theta = \mu$, $\sigma^2$ known        $N(m, \tau^2)$
Normal, $\theta = \sigma^2$, $\mu$ known        $IG(\alpha, \beta)$
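The normal update of Example 12.1 is short enough to write out directly. The following Python sketch computes $(m_1, \tau_1^2)$ from the sample mean, the sample size, the known variance, and the prior hyperparameters; the function name is an illustrative choice.

```python
def normal_posterior(xbar, n, sigma2, m, tau2):
    """Posterior hyperparameters (m1, tau1^2) for a N(m, tau2) prior on the mean.

    xbar   : sample mean of the n observations
    sigma2 : known observation variance
    """
    m1 = (tau2 * xbar + m * sigma2 / n) / (tau2 + sigma2 / n)
    tau1_sq = tau2 * sigma2 / (n * tau2 + sigma2)
    return m1, tau1_sq
```

Because the posterior is again normal, it can serve as the prior for a further batch of data: updating with two observations and then with two more gives the same hyperparameters as a single update with all four.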

There are many ways to specify a prior distribution in the Bayesian setting. Some 
prefer noninformative priors, and others prefer priors that are analytically tractable. 
Conjugate priors are adopted to address the latter concern. 

Given a likelihood function, the conjugate prior distribution is a prior distribution such that the posterior distribution belongs to the same class of distributions as the prior. A conjugate prior and its posterior distribution differ only in their hyperparameters. Example 12.1 serves as a good example. Conjugate priors facilitate statistical inferences because the posterior distributions belong to the same family as the prior distributions, which are usually in familiar forms. Moreover, updating posterior distributions with new information becomes straightforward, as only the hyperparameters have to be updated.

In the one-dimensional case, deriving conjugate priors is relatively simple when 
the likelihood belongs to the exponential family. Conjugacy within the exponential 
family is discussed in Lee (2004). Table 12.1 summarizes some of the commonly 
used conjugate families. Herein, Be denotes the Beta distribution, G the Gamma dis- 
tribution, IG the inverse Gamma distribution, and N the Normal distribution. 


12.3 SIMULATING POSTERIORS 


Bayesian inference makes use of simulation techniques to estimate the parameters 
naturally. As shown in Equation 12.3, calculating a posterior mean is tantamount to 
numerically evaluating an integral. It is not surprising, therefore, that Monte Carlo
simulation plays an important role. The integration in Equation 12.3 is usually an 
improper integral (integration over an unbounded region), which renders standard 
numerical techniques useless. Although numerical quadrature can be used to bypass 
such a difficulty in the one-dimensional case, applying quadrature in higher dimen- 
sions is far from simple. Financial modeling usually involves higher dimensions. 

Monte Carlo simulation with importance sampling simplifies the computation of Equation 12.3. As it may be difficult to generate random variables from the posterior distribution $\pi(\theta|x)$ directly, we may take advantage of the fact that importance sampling enables us to compute integrals with a conveniently chosen density. Consider

$$
\hat\theta = \int \theta\,\pi(\theta|x)\,d\theta = \int \frac{\theta\,\pi(\theta|x)}{q(\theta)}\,q(\theta)\,d\theta, \tag{12.4}
$$

where $q(\theta)$ is a prespecified density function from which samples can be generated easily. Drawing $n$ random samples $\theta_i$ from $q(\theta)$, we approximate the posterior mean by

$$
\hat\theta \approx \frac{1}{n}\sum_{i=1}^{n}\frac{\theta_i\,\pi(\theta_i|x)}{q(\theta_i)}.
$$

Note that importance sampling is not used as a variance reduction device in this case; rather, it is applied to facilitate the computation of the posterior mean. The variance of the estimate can be large in some cases.
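A minimal Python sketch of this estimator is given below. It assumes that the posterior density can be evaluated with its normalizing constant, as the formula above requires; the function names and the callables `posterior_pdf`, `q_pdf`, and `q_sampler` are hypothetical, and the caller is responsible for choosing a proposal whose tails are at least as heavy as those of the posterior.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_mean_importance(posterior_pdf, q_pdf, q_sampler, n, seed=0):
    """Average of theta_i * pi(theta_i | x) / q(theta_i) over n draws from q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        theta = q_sampler(rng)                       # draw theta_i from q
        total += theta * posterior_pdf(theta) / q_pdf(theta)
    return total / n
```

As an illustration, taking the posterior to be $N(1.6, 0.2)$ and the proposal to be $N(1, 1)$, the estimate from a large sample should lie close to the true posterior mean of 1.6.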


12.4 MARKOV CHAIN MONTE CARLO 

One desirable feature of combining Markov chain simulation with Bayesian ideas is that the resulting method can handle high-dimensional problems efficiently. Another desirable feature is the ability to draw random samples from the posterior distribution directly. The MCMC methods are developed with these two features in mind.

12.4.1 Gibbs Sampling

Gibbs sampling is probably one of the most commonly used MCMC methods. It is simple, intuitive, easily implemented, and designed to handle multidimensional problems. The basic limit theorem of Markov chains serves as the theoretical building block to guarantee that draws from Gibbs sampling agree with the posterior asymptotically.

Although conjugate priors are useful in Bayesian inference, it is difficult to construct a joint conjugate prior for several parameters. For a normal distribution with both mean and variance unknown, deriving the corresponding conjugate prior can be challenging. However, conditional conjugate priors can be obtained relatively easily; see, for example, Gilks, Richardson, and Spiegelhalter (1995). Conditioning on other parameters, a conditional conjugate prior is one dimensional and has the same distributional structure as the conditional posterior.

Gibbs sampling takes advantage of this fact and offers a way to reduce a multidimensional problem to an iteration of low-dimensional problems. Specifically, let $x = (x_1, \ldots, x_n)$ be the data and let the distribution of each $x_i$ be governed by $r$ parameters, $\theta = (\theta_1, \theta_2, \ldots, \theta_r)$. For each $j = 1, \ldots, r$, specify the one-dimensional conditional conjugate prior $p(\theta_j)$ and construct the conditional posterior by means of Bayes' theorem. Then iterate the Gibbs procedure as follows.

Set an initial parameter vector $(\theta_1^{(0)}, \ldots, \theta_r^{(0)})$. Update the parameters by the following procedure:

• Sample $\theta_1^{(1)} \sim p(\theta_1 \,|\, \theta_2^{(0)}, \ldots, \theta_r^{(0)}, x)$;

• Sample $\theta_2^{(1)} \sim p(\theta_2 \,|\, \theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_r^{(0)}, x)$;

• Sample $\theta_r^{(1)} \sim p(\theta_r \,|\, \theta_1^{(1)}, \theta_2^{(1)}, \ldots, \theta_{r-1}^{(1)}, x)$.

This completes one Gibbs iteration, and the parameters are updated to $(\theta_1^{(1)}, \ldots, \theta_r^{(1)})$. Using these new parameters as starting values, repeat the iteration and obtain a new set of parameters $(\theta_1^{(2)}, \ldots, \theta_r^{(2)})$. Repeating these iterations $M$ times, we get a sequence of parameter vectors $\theta^{(1)}, \ldots, \theta^{(M)}$, where $\theta^{(i)} = (\theta_1^{(i)}, \ldots, \theta_r^{(i)})$, for $i = 1, \ldots, M$. By virtue of the basic limit theorem of Markov chains, it can be shown that the Markov chain $\{\theta^{(M)}\}$ has a limiting distribution converging to the joint posterior $p(\theta_1, \theta_2, \ldots, \theta_r \,|\, x)$ when $M$ is sufficiently large; see Tierney (1994). The number $M$ is called the burn-in period. After simulating $\{\theta^{(M+1)}, \theta^{(M+2)}, \ldots, \theta^{(M+n)}\}$ from the Gibbs sampling, Bayesian inference can be conducted easily. For example, to compute the posterior mean, we evaluate

$$
\hat\theta = \frac{1}{n}\sum_{i=1}^{n}\theta^{(M+i)}.
$$

To acquire a clearer understanding of Gibbs sampling, consider the following 
example: 

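Example 12.2 One of the main uses of Gibbs sampling is to generate multivariate distributions that are usually hard to simulate by standard methods. We present a simple example to generate two correlated normal random variables $\theta_1$ and $\theta_2$, where

$$
\begin{pmatrix}\theta_1 \\ \theta_2\end{pmatrix} \sim N\left(\begin{pmatrix}0 \\ 0\end{pmatrix}, \begin{pmatrix}1 & \rho \\ \rho & 1\end{pmatrix}\right).
$$

To use the Gibbs sampling method, we construct a Markov chain $\{\theta^{(M)}\}$ that has a limiting distribution converging to the bivariate normal distribution $p(\theta_1, \theta_2)$. The next step is to find the conditional distribution of $\theta_1$ given the value of $\theta_2$. By the conditional distribution formula, we have

$$
p(\theta_1|\theta_2) = \frac{p(\theta_1, \theta_2)}{p(\theta_2)}
= \frac{\dfrac{1}{2\pi\sqrt{1-\rho^2}}\exp\left(-\dfrac{\theta_1^2 - 2\rho\theta_1\theta_2 + \theta_2^2}{2(1-\rho^2)}\right)}{\dfrac{1}{\sqrt{2\pi}}\exp\left(-\dfrac{\theta_2^2}{2}\right)}
= \frac{1}{\sqrt{2\pi(1-\rho^2)}}\exp\left(-\frac{(\theta_1 - \rho\theta_2)^2}{2(1-\rho^2)}\right).
$$

From the above-mentioned functional form of the density function, we can conclude that, given $\theta_2$,

$$
\theta_1\,|\,\theta_2 \sim N(\rho\theta_2, 1 - \rho^2).
$$

Similarly, for $\theta_2$, we have

$$
\theta_2\,|\,\theta_1 \sim N(\rho\theta_1, 1 - \rho^2).
$$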

By taking the initial guess of $\theta_2^{(0)}$ to be the mean 0, the normal random variables are generated by the following steps:

Step 1: Set $i = 1$ and $\theta_2^{(0)} = 0$.
Step 2: Generate $Z_1 \sim N(0, 1)$ and set $\theta_1^{(i)} = \rho\theta_2^{(i-1)} + \sqrt{1-\rho^2}\,Z_1$.
Step 3: Generate $Z_2 \sim N(0, 1)$ and set $\theta_2^{(i)} = \rho\theta_1^{(i)} + \sqrt{1-\rho^2}\,Z_2$.
Step 4: Set $i = i + 1$.
Step 5: Go to Step 2 until $i$ equals a prespecified integer $M$.

Note that $\theta_2^{(i)}$ in Step 3 is updated with the new $\theta_1^{(i)}$ generated in Step 2.
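The five steps of Example 12.2 translate almost line by line into code. The Python sketch below is illustrative only; the function name and the choice to return every pair, leaving the burn-in to the caller, are assumptions made here.

```python
import math
import random


def gibbs_bivariate_normal(rho, M, seed=0):
    """Return M successive Gibbs draws (theta1, theta2) for correlation rho."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    theta2 = 0.0                                   # Step 1: start at the mean
    draws = []
    for _ in range(M):
        theta1 = rho * theta2 + sd * rng.gauss(0.0, 1.0)   # Step 2
        theta2 = rho * theta1 + sd * rng.gauss(0.0, 1.0)   # Step 3 uses the new theta1
        draws.append((theta1, theta2))
    return draws
```

After discarding an initial stretch of draws, the sample correlation of the remaining pairs should be close to `rho`. Successive pairs are autocorrelated, so they are not independent draws.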

We demonstrated how to generate these random variables using the Cholesky 
decomposition in Chapter 6. In this example, using Cholesky is more convenient 
than using Gibbs sampling. Furthermore, to generate a sequence of independent 
bivariate normals, we would have to perform the whole procedure from the begin- 
ning again. This shows that although Gibbs sampling is powerful for dealing with 
high-dimensional problems, it may not be the most efficient method. 

Example 12.3 Let $x_1, \ldots, x_n$ be independent $N(\mu, \sigma^2)$ random variables with both $\mu$ and $\sigma^2$ unknown. Estimate $\mu$ and $\sigma^2$ via Gibbs sampling.

Recall that the conjugate prior of $\mu$ is normal for a given $\sigma^2$ and that the conjugate prior of $\sigma^2$ is inverse gamma for a given $\mu$. Let $\mu_0 \sim N(m_0, \tau_0^2)$ and $\sigma_0^2 \sim IG(\alpha_0, \beta_0)$ be random variables drawn from the initial priors. Define $\mu_i$ and $\sigma_i^2$ to be random variables generated in the $i$th iteration of the Gibbs sampling procedure. The conditional posterior for $\mu_i$ can be obtained by mimicking Example 12.1. We have

$$
\mu_i\,|\,\sigma_{i-1}^2 \sim N(m_i, \tau_i^2),
$$

where

$$
m_i = \frac{\tau_{i-1}^2\bar x + m_{i-1}\sigma_{i-1}^2/n}{\tau_{i-1}^2 + \sigma_{i-1}^2/n} \quad\text{and}\quad \tau_i^2 = \frac{\tau_{i-1}^2\sigma_{i-1}^2}{n\tau_{i-1}^2 + \sigma_{i-1}^2}. \tag{12.5}
$$

In Question 1 at the end of this chapter, the conditional posterior for $\sigma_i^2$ is found to be $\sigma_i^2\,|\,\mu_i \sim IG(\alpha_i, \beta_i)$, where

$$
\alpha_i = n/2 + \alpha_{i-1} \quad\text{and}\quad \beta_i = \beta_{i-1} + \frac{1}{2}\sum_{l=1}^{n}(x_l - \mu_i)^2. \tag{12.6}
$$

Hence, Gibbs sampling is implemented as follows:

Step 1: Set $i = 1$ and initial values of $m_0$, $\tau_0^2$, $\alpha_0$, $\beta_0$, and $\sigma_0^2$.
Step 2: Sample $\mu_i\,|\,\sigma_{i-1}^2 \sim N(m_i, \tau_i^2)$ and update $\alpha_i$ and $\beta_i$ by Equation 12.6.
Step 3: Sample $\sigma_i^2\,|\,\mu_i \sim IG(\alpha_i, \beta_i)$ and update $m_{i+1}$ and $\tau_{i+1}^2$ by Equation 12.5.
Step 4: Set $i = i + 1$.
Step 5: Go to Step 2 until $i$ equals a prespecified integer $M + k$.

We keep the last $k$ pairs of random variables for indices $M + 1$ to $M + k$. The estimation is achieved by taking the sample means:

$$
\hat\mu = \frac{1}{k}\sum_{j=M+1}^{M+k}\mu_j \quad\text{and}\quad \hat\sigma^2 = \frac{1}{k}\sum_{j=M+1}^{M+k}\sigma_j^2.
$$
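For experimentation, the following Python sketch implements a Gibbs sampler for the same problem. Note one deliberate difference from Equations 12.5 and 12.6 as written: it uses the standard form of the sampler, in which the prior hyperparameters $(m_0, \tau_0^2, \alpha_0, \beta_0)$ are held fixed in every iteration and only the conditioning values $\sigma^2$ and $\mu$ change, rather than updating the hyperparameters recursively. The function name, its arguments, and the starting value `sigma2_init` are assumptions of this sketch.

```python
import math
import random


def gibbs_normal(x, m0, tau0_sq, alpha0, beta0, sigma2_init, burn_in, keep, seed=0):
    """Posterior-mean estimates of (mu, sigma^2) from a Gibbs sampler."""
    rng = random.Random(seed)
    n = len(x)
    xbar = sum(x) / n
    sigma2 = sigma2_init
    mu_sum = 0.0
    sigma2_sum = 0.0
    for it in range(burn_in + keep):
        # mu | sigma2, x is normal (same form as Example 12.1, prior held fixed)
        m = (tau0_sq * xbar + m0 * sigma2 / n) / (tau0_sq + sigma2 / n)
        t2 = tau0_sq * sigma2 / (n * tau0_sq + sigma2)
        mu = rng.gauss(m, math.sqrt(t2))
        # sigma2 | mu, x is inverse gamma
        alpha = alpha0 + n / 2.0
        beta = beta0 + 0.5 * sum((xi - mu) ** 2 for xi in x)
        # If G ~ Gamma(shape alpha, scale 1/beta), then 1/G ~ IG(alpha, beta).
        sigma2 = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
        if it >= burn_in:                  # keep only the draws after burn-in
            mu_sum += mu
            sigma2_sum += sigma2
    return mu_sum / keep, sigma2_sum / keep
```

With a few hundred observations and a diffuse prior, the returned estimates should be close to the sample mean and sample variance of the data.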


12.4.2 Case Study: The Effect of Jumps on Dow Jones 

To appreciate the usefulness of Gibbs sampling, we use it to estimate the parameters 
of a jump-diffusion model and examine the effect of jumps on major financial indices. 
Note that maximum likelihood estimation does not work for this model (Redner and 
Walker, 1984). 

In the jump-diffusion model of Merton (1976), the dynamics of asset returns are assumed to be

$$
d\log S_t = \mu\,dt + \sigma\,dW_t + Y\,dN_t, \tag{12.7}
$$

where $S_t$ is the equity price, $W_t$ is the standard Brownian motion, $N_t$ follows a Poisson process with an intensity $\lambda$, and $Y$ is a normal random variable with a mean of $k$ and variance of $s^2$. We assume that $dW_t$, $dN_t$, and $Y$ are independent random variables at each time point $t$. This model requires the estimation of $\mu$, $\sigma$, $\lambda$, $k$, and $s$ based on observations $\{S_1, \ldots, S_n, S_{n+1}\}$, where $S_i$ represents the equity price observed at time $t_i$. These prices produce $n$ independent log-returns, which are denoted by $\{X_1, \ldots, X_n\}$, where $X_i = \log S_{i+1} - \log S_i$. With a fixed $\Delta t$, a discrete approximation to the dynamics in Equation 12.7 is

$$
\Delta\log S = \mu\,\Delta t + \sigma\,\Delta W_t + Y\,\Delta N_t. \tag{12.8}
$$

When $\Delta t$ is sufficiently small, $\Delta N_t$ is either 1, with a probability of $\lambda\Delta t$, or 0, with a probability of $1 - \lambda\Delta t$ (Fig. 12.1).

Example 12.4 Simulate 100 sample paths from the asset price dynamics of Equation 12.7 with the parameters $\mu = 0.08$, $\sigma = 0.4$, $\lambda = 3.5$, $s = 0.3$, and $k = 0$. Each sample path replicates the daily log-returns of a stock over a 1-year horizon. On the basis of these 100 paths, estimate the values of $\mu$, $\sigma$, $\lambda$, $s$, and $k$ with Gibbs sampling. Compare the results with the input values.

Simulating paths Sample paths are simulated by assuming $n = 250$ trading days a year, so the discretization (Eq. 12.8) has $\Delta t = 1/250$. On each path, the log-asset price at each time point is generated as follows:

$$
\log S_{i+1} - \log S_i = \begin{cases}
\mu\,\Delta t + \sigma\sqrt{\Delta t}\,\epsilon, & \text{if } U > \lambda\Delta t, \\
\mu\,\Delta t + k + \sqrt{\sigma^2\Delta t + s^2}\,\epsilon, & \text{if } U \le \lambda\Delta t,
\end{cases}
$$

where $\epsilon \sim N(0, 1)$ and $U \sim U(0, 1)$ are independent random variables. To simplify the notation, we denote $x_i = \log S_{i+1} - \log S_i$. A graph of three sample paths is given in Figure 12.1.

Figure 12.1 A sample path of the jump-diffusion model.
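The path-generation rule above can be sketched in Python as follows. The function name and the decision to return the jump indicators alongside the log-returns are illustrative choices; in an estimation exercise only the log-returns would be treated as observed.

```python
import math
import random


def simulate_log_returns(mu, sigma, lam, k, s, n=250, dt=1.0 / 250.0, seed=0):
    """Return (log-returns, jump indicators) for one path of the discretized model."""
    rng = random.Random(seed)
    returns, jumps = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        u = rng.random()
        # The text uses U <= lam*dt; the strict inequality differs only on a
        # probability-zero event and keeps lam = 0 free of jumps.
        if u < lam * dt:
            x = mu * dt + k + math.sqrt(sigma ** 2 * dt + s ** 2) * eps
            jumps.append(1)
        else:
            x = mu * dt + sigma * math.sqrt(dt) * eps
            jumps.append(0)
        returns.append(x)
    return returns, jumps
```

Averaged over many paths with $\lambda = 3.5$, the number of jumps per path should be close to $n\lambda\Delta t = 3.5$.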

Gibbs sampling There are five parameters in the model, so we have to develop five conditional conjugate priors from their conditional likelihood functions. Let us proceed step by step.

1. Conditional prior and posterior for $\mu$.

Other things being fixed, the likelihood function of $\mu$ happens to be proportional to a normal density. Specifically,

$$
L(\mu) \propto \prod_{i=1}^{n}\exp\left[-\frac{(x_i - \mu\Delta t - Y_i\Delta N_i)^2}{2\sigma^2\Delta t}\right]
\propto \exp\left[-\frac{1}{2\sigma^2/(n\Delta t)}\left(\mu - \frac{1}{n\Delta t}\sum_{i=1}^{n}(x_i - Y_i\Delta N_i)\right)^2\right].
$$

Therefore, a normal distribution $N(m, \tau^2)$ is suitable for $\mu$ as a conditional conjugate prior. The posterior distribution can be immediately obtained as

$$
N\left(\frac{\tau^2\,\frac{1}{n\Delta t}\sum_{i=1}^{n}(x_i - Y_i\Delta N_i) + m\sigma^2/(n\Delta t)}{\tau^2 + \sigma^2/(n\Delta t)},\ \frac{\tau^2\sigma^2}{n\Delta t\,\tau^2 + \sigma^2}\right). \tag{12.9}
$$


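2. Conditional prior and posterior for $\sigma^2$.

The conditional likelihood function of $\sigma^2$ is

$$
L(\sigma^2) \propto (\sigma^2)^{-n/2}\exp\left[-\frac{1}{2\sigma^2\Delta t}\sum_{i=1}^{n}(x_i - \mu\Delta t - Y_i\Delta N_i)^2\right].
$$

We select $IG(\alpha, \beta)$ as the conditional prior for $\sigma^2$. Then, the posterior distribution becomes

$$
IG\left(\alpha + n/2,\ \beta + \frac{\sum_{i=1}^{n}(x_i - \mu\Delta t - Y_i\Delta N_i)^2}{2\Delta t}\right). \tag{12.10}
$$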


3. Conditional prior and posterior for $\lambda$.

The conditional likelihood of $\lambda$ is

$$
L(\lambda) \propto (\lambda\Delta t)^{N}(1 - \lambda\Delta t)^{n-N},
$$

where $N$ is the total number of jumps in the horizon. From Table 12.1, we find that the appropriate conjugate prior for the jump probability $\lambda\Delta t$ is $Be(a, b)$. Simple computation shows that the posterior distribution is

$$
Be(a + N,\ b + n - N). \tag{12.11}
$$


4. Conditional prior and posterior for $k$.

As $k$ is the mean of the normal jump size, its prior and posterior are obtained in the same manner as $\mu$. We state the result without proof. The prior is $N(m_Y, \tau_Y^2)$, and the posterior is given by

$$
N\left(\frac{\tau_Y^2\sum_{i=1}^{N}Y_i/N + m_Y s^2/N}{\tau_Y^2 + s^2/N},\ \frac{\tau_Y^2 s^2}{N\tau_Y^2 + s^2}\right), \tag{12.12}
$$

where the sum runs over the $N$ jump sizes.

5. Conditional prior and posterior for $s^2$.

As $s^2$ is the variance of the normal jump size, its prior and posterior are obtained in the same manner as $\sigma^2$. The prior is $IG(\alpha_Y, \beta_Y)$, and the posterior is given by

$$
IG\left(\alpha_Y + N/2,\ \beta_Y + \frac{\sum_{i=1}^{N}(Y_i - k)^2}{2}\right). \tag{12.13}
$$


The aforementioned priors and posteriors are distributions conditional on values of $Y_i$ and $\Delta N_i$. This complicates the Gibbs sampling procedure because only $x_i$ is observable for all $i$. Therefore, at each time point $t_i$, $Y_i$ and $\Delta N_i$ should be simulated from the distributions conditional on the observed value of $x_i$ before substituting them into the priors or posteriors. We need the following facts:

$$
x_i\,|\,\Delta N_i = 0 \sim N(\mu\Delta t,\ \sigma^2\Delta t); \qquad
x_i\,|\,\Delta N_i = 1 \sim N(\mu\Delta t + k,\ \sigma^2\Delta t + s^2),
$$

which together with Bayes' theorem show that

$$
\begin{aligned}
P(\Delta N_i = 1\,|\,x_i) &= \frac{P(x_i\,|\,\Delta N_i = 1)\,\lambda\Delta t}{P(x_i\,|\,\Delta N_i = 1)\,\lambda\Delta t + P(x_i\,|\,\Delta N_i = 0)(1 - \lambda\Delta t)}, \\
P(\Delta N_i = 0\,|\,x_i) &= 1 - P(\Delta N_i = 1\,|\,x_i).
\end{aligned}
\tag{12.14}
$$

The jump size $Y_i$ is necessary only when $\Delta N_i = 1$. Under such a situation, we recognize that the conditional density function of $Y_i$ is

$$
f(Y_i\,|\,x_i) \propto f(x_i\,|\,Y_i)\,p(Y_i) \propto \exp\left[-\frac{(x_i - Y_i - \mu\Delta t)^2}{2\sigma^2\Delta t}\right]\exp\left[-\frac{(Y_i - k)^2}{2s^2}\right],
$$

which implies

$$
Y_i\,|\,x_i \sim N\left(\frac{(x_i - \mu\Delta t)/(\sigma^2\Delta t) + k/s^2}{1/(\sigma^2\Delta t) + 1/s^2},\ \frac{1}{1/(\sigma^2\Delta t) + 1/s^2}\right). \tag{12.15}
$$


With all of the ingredients ready, the Gibbs sampling starts by choosing the initial values of $\mu_0$, $\sigma_0^2$, $k_0$, $\lambda_0$, and $s_0^2$. We also need initial values for $Y_i^{(0)}$ and $\Delta N_i^{(0)}$, both of which can be obtained by a simulation with the initial parameters. The Gibbs sampling runs as follows:

Step 1: Sample $\mu_j \sim p(\mu_j\,|\,\sigma_{j-1}^2, k_{j-1}, s_{j-1}^2, \lambda_{j-1})$, as given in Equation 12.9.
Step 2: Sample $\sigma_j^2 \sim p(\sigma_j^2\,|\,\mu_j, k_{j-1}, s_{j-1}^2, \lambda_{j-1})$, as given in Equation 12.10.
Step 3: Sample $\lambda_j \sim p(\lambda_j\,|\,\mu_j, \sigma_j^2, k_{j-1}, s_{j-1}^2)$, as given in Equation 12.11.
Step 4: Sample $k_j \sim p(k_j\,|\,\mu_j, \sigma_j^2, s_{j-1}^2, \lambda_j)$, as given in Equation 12.12.
Step 5: Sample $s_j^2 \sim p(s_j^2\,|\,\mu_j, \sigma_j^2, k_j, \lambda_j)$, as given in Equation 12.13.
Step 6: Sample $\Delta N_i^{(j)} \sim p(\Delta N_i^{(j)}\,|\,\mu_j, \sigma_j^2, k_j, s_j^2, \lambda_j)$, as given in Equation 12.14, for all $i = 1, 2, \ldots, n$.
Step 7: Sample $Y_i^{(j)} \sim p(Y_i^{(j)}\,|\,\mu_j, \sigma_j^2, k_j, s_j^2)$, as given in Equation 12.15, for the time points $t_i$ where $\Delta N_i^{(j)} = 1$.
Step 8: Set $j = j + 1$ and go to Step 1. Repeat until $j = M' + M$.

Inference is drawn by taking the sample means of the values of the last $M$ simulated parameters. The VBA code is available online in the supplementary document.
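Steps 6 and 7, which impute the unobserved jump indicators and jump sizes, are the least standard part of the procedure. The Python sketch below illustrates them using Equations 12.14 and 12.15; it is not the authors' VBA code. All function names are hypothetical, the variances are passed as `sigma2` and `s2`, and storing 0 for the jump size when no jump is drawn is a convention chosen here.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def jump_probability(x, mu, sigma2, lam, k, s2, dt):
    """P(dN_i = 1 | x_i) from Equation 12.14."""
    with_jump = normal_pdf(x, mu * dt + k, sigma2 * dt + s2) * lam * dt
    no_jump = normal_pdf(x, mu * dt, sigma2 * dt) * (1.0 - lam * dt)
    return with_jump / (with_jump + no_jump)


def jump_size_posterior(x, mu, sigma2, k, s2, dt):
    """Mean and variance of Y_i | x_i from Equation 12.15."""
    precision = 1.0 / (sigma2 * dt) + 1.0 / s2
    mean = ((x - mu * dt) / (sigma2 * dt) + k / s2) / precision
    return mean, 1.0 / precision


def sample_latent(xs, mu, sigma2, lam, k, s2, dt, rng):
    """Steps 6 and 7: draw jump indicators, then jump sizes where a jump occurred."""
    dN, Y = [], []
    for x in xs:
        jump = 1 if rng.random() < jump_probability(x, mu, sigma2, lam, k, s2, dt) else 0
        if jump:
            mean, var = jump_size_posterior(x, mu, sigma2, k, s2, dt)
            Y.append(rng.gauss(mean, math.sqrt(var)))
        else:
            Y.append(0.0)        # placeholder: Y_i is not needed without a jump
        dN.append(jump)
    return dN, Y
```

A daily return far out in the tails is classified as a jump with probability close to one, while a return near $\mu\Delta t$ is almost never classified as a jump, which matches the intuition behind Equation 12.14.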


Results and comparisons Table 12.2 shows our estimation results. We report the 
averaged posterior means over the 100 sample paths and the variances. As the table 
shows, the estimates are close to the true values, and the variances are small. Gibbs 
sampling does a good job of estimating the parameters for jump-diffusion models. 

Example 12.4 shows the usefulness of Gibbs sampling in estimating the 
jump-diffusion model. In practice, this application can be crucial for a risk manager 
to assess how much risk is due to jumps. To examine the jump risk empirically, 
we estimate the effect of jumps on the Dow Jones Industrial Index. Our estimation 
is based on daily closing prices over the 1995-2004 period. The parameters are 
estimated on an annual basis. 


TABLE 12.2 Performance of the Gibbs Sampling

              $\mu$     $\sigma^2$            $\lambda$   $k$       $s^2$
True value    0.08      0.4                   3.5         0         0.3
Mean          0.0769    0.3986                3.8600      0.0163    0.2868
Variance      0.0233    $6.5\times10^{-5}$    0.8895      0.0039    0.0015
TABLE 12.3 Jump-Diffusion Estimation for Dow Jones

Year    $\mu$      $\sigma^2$   $\lambda$   $k$        $s^2$
1995     0.2871    0.0901       1.9035       0.0627    0.2608
1996     0.2483    0.1172       2.818       -0.0337    0.235
1997     0.2384    0.1684       3.6587      -0.0256    0.2087
1998     0.1776    0.1752       5.5127      -0.0123    0.1782
1999     0.2177    0.1624       1.7968      -0.0176    0.2627
2000    -0.0162    0.1971       3.3364      -0.0235    0.2157
2001     0.015     0.1951       4.1797      -0.0383    0.2008
2002    -0.2188    0.2484       2.7072       0.0106    0.239
2003     0.1891    0.1626       2.0479       0.0661    0.2463
2004     0.0351    0.1111       1.7561       0.0004    0.2788


In Table 12.3, the number of jumps per year, $\lambda$, ranges from 1.75 to 5.5. Therefore, we can have 5-6 jumps in a particular year. The effect of jumps is significant, as almost all of the $s^2$ values are bigger than 0.2. The variances $\sigma^2$ associated with the Brownian motion part of the model are about 0.2 but should be divided by 250 to produce the daily variance. When a jump arrives, additional daily variance of 0.2 is added to the index return variance: $\sigma^2/250 + s^2$. The additional variance due to a jump is relatively large. Jump risk cannot be ignored! This information is useful for risk managers to construct scenarios for stress testing.
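
To make the size of this effect concrete, the short Python sketch below repeats the arithmetic for one row of Table 12.3. It is an illustration only: the function name `daily_variances` is ours, and the 1998 estimates are used simply because they are typical of the table.

```python
# Annualized estimates from the 1998 row of Table 12.3.
SIGMA2_1998 = 0.1752   # diffusion variance per year
S2_1998 = 0.1782       # variance contributed by one jump
TRADING_DAYS = 250     # days per year, as assumed in the text


def daily_variances(sigma2, s2, days=TRADING_DAYS):
    """Return (variance on a day without a jump, variance on a day with one jump)."""
    diffusion = sigma2 / days
    return diffusion, diffusion + s2


no_jump, with_jump = daily_variances(SIGMA2_1998, S2_1998)
print(f"no jump: {no_jump:.6f}  one jump: {with_jump:.6f}  "
      f"ratio: {with_jump / no_jump:.1f}")
```

On a day with a jump, the return variance is therefore larger than the pure diffusion variance by a factor in the hundreds, which is why a stress scenario built only from the diffusion part would understate the risk.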


12.5 METROPOLIS-HASTINGS ALGORITHM 

In this section, we explain why random draws using Gibbs sampling approximate the posterior distribution. To obtain a general result, we first introduce the Metropolis-Hastings algorithm, of which Gibbs sampling is a special case. We then show that the Metropolis-Hastings algorithm constructs a Markov chain whose limiting distribution is the posterior distribution. Further details are given in Casella and George (1992), Chib and Greenberg (1995), and Lee (2004).

Consider a Markov chain $\{\theta^{(n)}\}$ with a finite state space $\{1, 2, \ldots, m\}$ and transition probabilities $p_{ij}$. Given the transition probabilities, the limiting distribution of the chain can be found by solving the following equation:

$$\pi(j) = \sum_{i=1}^{m} \pi(i)\, p_{ij}.$$

When the state space is continuous, the sum is replaced by an integral (Bhattacharya and Waymire, 1990).
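
For a small chain, this equation can be solved numerically by applying the transition matrix repeatedly to any starting distribution. The Python sketch below does this for a hypothetical three-state chain chosen only for illustration; the function name `stationary_distribution` is ours, and the iteration is assumed to converge, which holds for an irreducible and aperiodic chain such as this one.

```python
def stationary_distribution(P, max_iter=10_000, tol=1e-12):
    """Iterate pi <- pi P from the uniform vector until pi stops changing."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical transition matrix; each row sums to one.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
pi = stationary_distribution(P)
print([round(p, 4) for p in pi])
```

Solving the equation by hand for this matrix gives $(0.25, 0.5, 0.25)$, which the iteration reproduces.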

In MCMC, we work with a reverse problem. Given a posterior distribution $\pi(j)$, we want to construct a Markov chain whose limiting distribution is the posterior distribution. If the transition probabilities satisfy the time reversibility with respect to $\pi(j)$, then the limiting distribution of the chain is guaranteed to be equal to $\pi(j)$. To explain time reversibility, write the transition probabilities $p_{ij}$ as

$$p_{ij} = p^*_{ij} + r_i \delta_{ij}, \qquad \delta_{ii} = 1 \text{ and } \delta_{ij} = 0 \text{ for } i \neq j,$$

where $p^*_{ii} = 0$, $p^*_{ij} = p_{ij}$ for $i \neq j$, and $r_i = p_{ii}$.

If the equation

$$\pi(i)\, p^*_{ij} = \pi(j)\, p^*_{ji} \qquad (12.16)$$

is satisfied for all $i$ and $j$, then the probabilities $p_{ij}$ are time reversible. This condition asserts that the probability of starting at $i$ and ending at $j$ when the initial probability is given by $\pi(i)$ is the same as that of starting at $j$ and ending at $i$. By simple computation, we check that

$$\begin{aligned}
\sum_i \pi(i)\, p_{ij} &= \sum_i \pi(i)\, p^*_{ij} + \sum_i \pi(i)\, r_i \delta_{ij} \\
&= \sum_i \pi(j)\, p^*_{ji} + \pi(j)\, r_j \\
&= \pi(j)(1 - r_j) + \pi(j)\, r_j \\
&= \pi(j).
\end{aligned}$$

Therefore, $\pi(j)$ is the limiting distribution of the chain.
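
The implication runs in one direction only, which a small numerical check makes visible. In the Python sketch below, the two chains and the function names are hypothetical choices of ours: the first chain satisfies Equation 12.16 and is therefore stationary, whereas the second, a deterministic cycle, is stationary under the uniform distribution without being time reversible.

```python
def is_time_reversible(P, pi, tol=1e-12):
    """Check pi(i) p_ij == pi(j) p_ji for every pair i != j (Equation 12.16)."""
    m = len(P)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < tol
               for i in range(m) for j in range(m) if i != j)


def is_stationary(P, pi, tol=1e-12):
    """Check pi(j) == sum_i pi(i) p_ij for every state j."""
    m = len(P)
    return all(abs(sum(pi[i] * P[i][j] for i in range(m)) - pi[j]) < tol
               for j in range(m))


# Hypothetical reversible chain and its stationary distribution.
P_rev = [[0.50, 0.50, 0.00], [0.25, 0.50, 0.25], [0.00, 0.50, 0.50]]
pi_rev = [0.25, 0.50, 0.25]

# Hypothetical cyclic chain 0 -> 1 -> 2 -> 0 with the uniform distribution.
P_cyc = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
pi_cyc = [1 / 3, 1 / 3, 1 / 3]

print(is_time_reversible(P_rev, pi_rev), is_stationary(P_rev, pi_rev))
print(is_time_reversible(P_cyc, pi_cyc), is_stationary(P_cyc, pi_cyc))
```

Time reversibility is thus a sufficient condition that is convenient to engineer, not a necessary one.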

In other words, a Markov chain whose limiting distribution is the posterior distribution can be constructed by finding a time-reversible Markov chain. We start this process by specifying the transition probabilities $q_{ij}$. If the probabilities $q_{ij}$ already satisfy the time reversibility, then the corresponding Markov chain is the one we want. Otherwise, suppose that

$$\pi(i)\, q_{ij} > \pi(j)\, q_{ji}.$$

Then, the chain has a higher probability of moving from $i$ to $j$ than from $j$ to $i$. Therefore, we introduce a probability $\alpha_{ij}$ to reduce the moves from $i$ to $j$. We would like to have

$$\pi(i)\, q_{ij}\, \alpha_{ij} = \pi(j)\, q_{ji}, \qquad (12.17)$$

so that

$$\alpha_{ij} = \frac{\pi(j)\, q_{ji}}{\pi(i)\, q_{ij}}.$$

As we do not want to reduce the likelihood of moving from $j$ to $i$, we set $\alpha_{ji} = 1$. Therefore, the general formula is

$$\alpha_{ij} = \min\left\{ \frac{\pi(j)\, q_{ji}}{\pi(i)\, q_{ij}},\ 1 \right\}. \qquad (12.18)$$

From Equations 12.17 and 12.18, we see that the transition probabilities

$$p_{ij} = q_{ij}\, \alpha_{ij} \ \text{ for } i \neq j, \qquad p_{ii} = 1 - \sum_{j \neq i} q_{ij}\, \alpha_{ij} \qquad (12.19)$$

are time reversible with respect to $\pi(i)$ and hence define a Markov chain whose limiting distribution is the required one. This method is called the Metropolis-Hastings algorithm.

Example 12.5 Consider a random walk Markov chain:

A <-> B <-> C <-> D

All transition probabilities are 0.5, except that the transitions “from A to B” and “from D to C” are 1. The transition matrix of the chain is given by

      A      B      C      D
A     0      1      0      0
B     0.5    0      0.5    0
C     0      0.5    0      0.5
D     0      0      1      0


On the basis of the Metropolis-Hastings algorithm, construct a Markov chain whose limiting distribution is $(\frac14, \frac14, \frac14, \frac14)$.

A simple calculation shows that the limiting distribution of the original Markov chain is $(\frac16, \frac13, \frac13, \frac16)$. To construct the desired Markov chain, we need to compute the probabilities $\alpha_{ij}$. For instance,

$$\alpha_{AB} = \min\left\{1,\ \frac{\pi(B)\, P(A \mid B)}{\pi(A)\, P(B \mid A)}\right\} = \min\left\{1,\ \frac{(\frac14)(\frac12)}{(\frac14)(1)}\right\} = \frac12.$$

This means that the transition probability “from A to B” is reduced from 1 to $1 \times \frac12 = \frac12$. For node “A,” the remaining transition probabilities correspond to the event that no transition occurs. Transition probabilities for the other nodes are obtained in the same manner. The final transition matrix becomes



      A      B      C      D
A     0.5    0.5    0      0
B     0.5    0      0.5    0
C     0      0.5    0      0.5
D     0      0      0.5    0.5
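
The whole construction of this example can be reproduced in a few lines. The Python sketch below is our own illustration rather than the book's VBA code: the function name `metropolis_hastings_matrix` is hypothetical, and exact fractions are used so that the comparison with the matrix above involves no rounding.

```python
from fractions import Fraction as F


def metropolis_hastings_matrix(Q, pi):
    """Build p_ij = q_ij * alpha_ij for i != j; the leftover mass stays at i."""
    m = len(Q)
    P = [[F(0)] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j and Q[i][j] > 0:
                alpha = min(pi[j] * Q[j][i] / (pi[i] * Q[i][j]), F(1))
                P[i][j] = Q[i][j] * alpha
        P[i][i] = 1 - sum(P[i])   # probability that no transition occurs
    return P


# Original random walk on A, B, C, D and the uniform target of Example 12.5.
Q = [[F(0), F(1), F(0), F(0)],
     [F(1, 2), F(0), F(1, 2), F(0)],
     [F(0), F(1, 2), F(0), F(1, 2)],
     [F(0), F(0), F(1), F(0)]]
target = [F(1, 4)] * 4

P = metropolis_hastings_matrix(Q, target)
for row in P:
    print([str(p) for p in row])
```

Only the two moves that had probability 1 are thinned; every other proposed move is accepted with probability 1, so the interior rows are unchanged.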


It is easy to verify that the limiting distribution of this Markov chain is $(\frac14, \frac14, \frac14, \frac14)$.

To apply the Metropolis-Hastings algorithm for simulating a random variable $\theta$ with the distribution $\pi(\theta)$, we begin with any Markov chain $X_k$ whose transition density $q(X_k \mid X_{k-1})$ is easy to simulate and whose range is similar to that of $\theta$. For this Markov chain to have the desired limiting distribution $\pi(\theta)$, we need to adjust the transition density $q(X_k \mid X_{k-1})$ at each step $k$ of the algorithm according to the updating criteria in Equation 12.19 so that it is time reversible. That is, if the transition probability from state $X_{k-1}$ to state $X_k$ is too high, we reduce it by an appropriate amount, so that the new transition probability $p(X_k \mid X_{k-1})$ forms a time-reversible Markov chain with a stationary distribution of $\pi(\theta)$. The algorithm can be summarized as follows:

Step 1: Choose a transition probability $q$ to construct the Markov chain $X_k$.

Step 2: Pick an initial value for $\theta_0$ and $X_0$ and set $k = 1$.

Step 3: Simulate $X_k$ according to the probability law of $q(X_k \mid X_{k-1})$.

Step 4: If $\alpha = \dfrac{q(X_{k-1} \mid X_k)\, \pi(X_k)}{q(X_k \mid X_{k-1})\, \pi(X_{k-1})} \ge 1$, set $\theta_k = X_k$ and go to Step 6.

Step 5: Otherwise, generate $W \sim U[0, 1]$. If $W < \alpha$, set $\theta_k = X_k$; otherwise set $\theta_k = \theta_{k-1}$ and $X_k = X_{k-1}$.

Step 6: Set $k = k + 1$ and return to Step 3 until $k$ is equal to a prespecified integer $M$.
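
The steps above can be sketched generically in Python. This is a minimal illustration, not the book's VBA code: the function name and its arguments are our own, the densities are passed on the log scale to avoid numerical underflow, and the demonstration (a standard normal target with a Gaussian random-walk proposal of scale 2.4) is a hypothetical choice.

```python
import math
import random


def metropolis_hastings(log_pi, propose, log_q, x0, n_steps, rng):
    """Run Steps 2 to 6 and return theta_0, ..., theta_M.

    log_pi(x)          : log pi(x), up to an additive constant
    propose(x_old)     : one draw from q(. | x_old)
    log_q(x_to, x_from): log q(x_to | x_from), up to an additive constant
    """
    x_old = x0
    theta = [x0]
    for _ in range(n_steps):
        x_new = propose(x_old)                       # Step 3
        log_alpha = (log_q(x_old, x_new) + log_pi(x_new)
                     - log_q(x_new, x_old) - log_pi(x_old))
        # Steps 4 and 5: accept outright if alpha >= 1, else with probability alpha.
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x_old = x_new
        theta.append(x_old)                          # a rejection repeats the old state
    return theta


# Hypothetical demonstration: N(0, 1) target, Gaussian random-walk proposal.
rng = random.Random(0)
STEP = 2.4
draws = metropolis_hastings(
    log_pi=lambda x: -0.5 * x * x,
    propose=lambda x: x + rng.gauss(0.0, STEP),
    log_q=lambda x_to, x_from: -0.5 * ((x_to - x_from) / STEP) ** 2,
    x0=0.0, n_steps=50_000, rng=rng)

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

Because $\pi$ enters only through the ratio $\pi(X_k)/\pi(X_{k-1})$, the target needs to be known only up to its normalizing constant, which is what makes the method attractive for posterior distributions.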

Example 12.6 In the previous chapter, we showed how to generate a normal random variable, using the acceptance-rejection method, for example. In this section, we demonstrate how a normal random variable can be generated by the Metropolis-Hastings algorithm. Let $\theta \sim N(0, 1)$. We need to construct a Markov chain that has a limiting distribution equal to a normal distribution.

Let $X_k$ be a stochastic process such that for each $k = 0, 1, 2, \ldots$, $X_k$ is a double exponential random variable; that is, $X_k \sim \text{DoubleExp}(1)$ with pdf as follows:

$$p(x_k) = \tfrac{1}{2} e^{-|x_k|}.$$

Because the $X_k$ are drawn independently of one another,

$$P(X_{k+1} \mid X_k) = P(X_{k+1}),$$

so the process can be considered as a special case of a Markov chain in which the current state is independent of all previous states. It takes values from negative infinity to positive infinity, making it a good candidate to approximate the normal random variable. The process $X_k$ serves as the initial chain, and its transition probability is adjusted according to the Metropolis-Hastings algorithm to transform it into a time-reversible Markov chain as follows:


Step 1: Set $k = 1$, $X_0 = 0$ and $\theta_0 = 0$.

Step 2: Generate $U, V \sim U[0, 1]$.

Step 3: If $V > \frac12$, set $X_k = -\ln U$, otherwise set $X_k = \ln U$.

Step 4: If $\alpha = \dfrac{e^{-X_k^2/2} \cdot e^{-|X_{k-1}|}}{e^{-X_{k-1}^2/2} \cdot e^{-|X_k|}} \ge 1$, set $\theta_k = X_k$ and go to Step 6.

Step 5: Otherwise, generate $W \sim U[0, 1]$. If $W < \alpha$, set $\theta_k = X_k$, else set $\theta_k = \theta_{k-1}$ and $X_k = X_{k-1}$.

Step 6: Set $k = k + 1$ and repeat Step 2 until $k$ equals a prespecified integer $M$.
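
For readers who prefer to experiment outside Excel, the six steps translate almost line by line into Python. The sketch below is our own rendering and not the VBA code from the book's website; the function names, the seed, and the chain length of 20,000 are hypothetical choices.

```python
import math
import random


def double_exp_draw(rng):
    """Steps 2 and 3: draw from DoubleExp(1) using two uniforms."""
    u = 1.0 - rng.random()      # lies in (0, 1], so log(u) is finite
    v = rng.random()
    return -math.log(u) if v > 0.5 else math.log(u)


def normal_via_mh(n_steps, seed=0):
    """Generate theta_1, ..., theta_M whose limiting distribution is N(0, 1)."""
    rng = random.Random(seed)
    x_prev = 0.0                # Step 1: X_0 = theta_0 = 0
    theta = []
    for _ in range(n_steps):
        x_new = double_exp_draw(rng)
        # Step 4 on the log scale: numerator minus denominator of alpha.
        log_alpha = ((-0.5 * x_new ** 2 - abs(x_prev))
                     - (-0.5 * x_prev ** 2 - abs(x_new)))
        # Step 5: accept with probability min(alpha, 1); otherwise keep X_{k-1}.
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x_prev = x_new
        theta.append(x_prev)
    return theta


sample = normal_via_mh(20_000)
sample_mean = sum(sample) / len(sample)
sample_var = sum((s - sample_mean) ** 2 for s in sample) / len(sample)
print(f"sample mean {sample_mean:.3f}, sample variance {sample_var:.3f}")
```

The double exponential works well here because its tails are heavier than those of the normal, so the ratio of target to proposal density stays bounded and few candidates are rejected.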

The VBA code is available online on the book’s website. The following theorem shows that the Gibbs sampling algorithm is a special case of the Metropolis-Hastings algorithm.

Theorem 12.1 Gibbs sampling is a special case of the Metropolis-Hastings algorithm in which every jump is accepted with $\alpha = 1$.
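
Before turning to the proof, the claim can be checked numerically on a small discrete example. In the Python sketch below, the joint probabilities on a $3 \times 2$ grid and the function names are hypothetical; the proposal is the Gibbs update of the first component, and the acceptance probability is computed for every possible move.

```python
# Hypothetical joint pmf pi(theta_1, theta_2) on a 3 x 2 grid.
PI = {(0, 0): 0.10, (1, 0): 0.25, (2, 0): 0.05,
      (0, 1): 0.20, (1, 1): 0.10, (2, 1): 0.30}


def gibbs_q(new, old):
    """q(new | old) when Gibbs sampling updates the first component.

    Equals pi(theta_1 = new[0] | theta_2) if the second components agree, else 0.
    """
    if new[1] != old[1]:
        return 0.0
    marginal = sum(p for (_, b), p in PI.items() if b == old[1])
    return PI[new] / marginal


def mh_alpha(new, old):
    """Metropolis-Hastings acceptance probability for the move old -> new."""
    ratio = PI[new] * gibbs_q(old, new) / (PI[old] * gibbs_q(new, old))
    return min(ratio, 1.0)


alphas = [mh_alpha((a_new, b), (a_old, b))
          for b in (0, 1) for a_old in (0, 1, 2) for a_new in (0, 1, 2)]
print(min(alphas), max(alphas))
```

Every acceptance probability equals one up to floating-point rounding, whatever the joint distribution placed in `PI`.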

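Proof. Suppose that there are $r$ parameters, that is, $\theta = (\theta_1, \ldots, \theta_r)$, in the model. We want to generate $\theta \sim \pi(\theta)$ for a given $\pi(\cdot)$. Let $\theta^{(0)}$ be the initial state of $\theta$. We generate a sequence of vectors by Gibbs sampling:

$$\theta^{(0)} \to \theta^{(1)} \to \theta^{(2)} \to \theta^{(3)} \to \cdots \to \theta^{(n)} \to \theta^{(n+1)} \to \cdots$$

where $\theta^{(n)}$ and $\theta^{(n+1)}$ only differ in one component. This sequence of vectors evolves according to the conditional density given by the Gibbs sampling algorithm. For example, the transition density from $\theta^{(k)}$ to $\theta^{(k+1)}$, where $k < r$, is governed by the conditional density $p(\theta_k \mid \theta_1, \theta_2, \ldots, \theta_{k-1}, \theta_{k+1}, \ldots, \theta_r)$. This is a Markov chain because the conditional density depends only on the previous state; in fact, only on $r - 1$ components, $(\theta_1, \theta_2, \ldots, \theta_{k-1}, \theta_{k+1}, \ldots, \theta_r)$, of the previous state. Now suppose that $\theta^{(n)}$ and $\theta^{(n+1)}$ differ in the first component:

$$\theta^{(n+1)} = \left(\theta_1^{(n+1)}, \theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_r^{(n)}\right),$$

where $\theta_1^{(n+1)}$ is drawn given $\left(\theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_r^{(n)}\right)$. The transition density from $\theta^{(n)}$ to $\theta^{(n+1)}$ is given by the conditional probability density of $\pi(\cdot)$, given $\theta^{(n)}$, in the Gibbs sampling:

$$\begin{aligned}
q\left(\theta^{(n+1)} \mid \theta^{(n)}\right) &= q\left(\left(\theta_1^{(n+1)}, \theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_r^{(n)}\right) \mid \left(\theta_1^{(n)}, \theta_2^{(n)}, \ldots, \theta_r^{(n)}\right)\right) \\
&= p\left(\theta_1 = \theta_1^{(n+1)} \mid \theta_2 = \theta_2^{(n)}, \theta_3 = \theta_3^{(n)}, \ldots, \theta_r = \theta_r^{(n)}\right).
\end{aligned}$$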

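The second equality arises because the transition density from $\theta^{(n)}$ to $\theta^{(n+1)}$ does not depend on the first component. The Metropolis-Hastings algorithm multiplies the transition density $q$ by $\alpha$, where

$$\alpha = \min\left\{ \frac{\pi\left(\theta^{(n+1)}\right) q\left(\theta^{(n)} \mid \theta^{(n+1)}\right)}{\pi\left(\theta^{(n)}\right) q\left(\theta^{(n+1)} \mid \theta^{(n)}\right)},\ 1 \right\},$$

and modifies the original Markov chain to become a time-reversible one. We can prove that this Markov chain is time reversible by showing that $\alpha = 1$. Now, by conditioning on $(\theta_2, \theta_3, \ldots, \theta_r)$ with the marginal probability density $p_1(\cdot)$ of $\pi(\cdot)$, we expand $\pi\left(\theta^{(n)}\right)$ and $\pi\left(\theta^{(n+1)}\right)$ as follows:

$$\begin{aligned}
\pi\left(\theta^{(n)}\right) &= \pi\left(\theta_1^{(n)}, \theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_r^{(n)}\right) \\
&= p\left(\theta_1 = \theta_1^{(n)} \mid \theta_2 = \theta_2^{(n)}, \theta_3 = \theta_3^{(n)}, \ldots, \theta_r = \theta_r^{(n)}\right) \\
&\quad \times p_1\left(\theta_2 = \theta_2^{(n)}, \theta_3 = \theta_3^{(n)}, \ldots, \theta_r = \theta_r^{(n)}\right),
\end{aligned}$$

$$\begin{aligned}
\pi\left(\theta^{(n+1)}\right) &= \pi\left(\theta_1^{(n+1)}, \theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_r^{(n)}\right) \\
&= p\left(\theta_1 = \theta_1^{(n+1)} \mid \theta_2 = \theta_2^{(n)}, \theta_3 = \theta_3^{(n)}, \ldots, \theta_r = \theta_r^{(n)}\right) \\
&\quad \times p_1\left(\theta_2 = \theta_2^{(n)}, \theta_3 = \theta_3^{(n)}, \ldots, \theta_r = \theta_r^{(n)}\right).
\end{aligned}$$

Similarly, we have

$$\begin{aligned}
q\left(\theta^{(n)} \mid \theta^{(n+1)}\right) &= q\left(\left(\theta_1^{(n)}, \theta_2^{(n)}, \ldots, \theta_r^{(n)}\right) \mid \left(\theta_1^{(n+1)}, \theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_r^{(n)}\right)\right) \\
&= p\left(\theta_1 = \theta_1^{(n)} \mid \theta_2 = \theta_2^{(n)}, \theta_3 = \theta_3^{(n)}, \ldots, \theta_r = \theta_r^{(n)}\right).
\end{aligned}$$

The second equality is still due to the fact that from $\theta^{(n+1)}$ to $\theta^{(n)}$, the only component that differs is the first component, so the transition density is again given by the conditional density of $\pi(\cdot)$ given $(\theta_2, \theta_3, \ldots, \theta_r)$. Comparing the aforementioned four equations gives

$$\pi\left(\theta^{(n+1)}\right) q\left(\theta^{(n)} \mid \theta^{(n+1)}\right) = \pi\left(\theta^{(n)}\right) q\left(\theta^{(n+1)} \mid \theta^{(n)}\right),$$

so that $\alpha = 1$. This simply means that the probability of going from the $n$th state to the $(n+1)$th state is equivalent to that of going from the $(n+1)$th state to the $n$th state. Similarly, we can show that $\alpha = 1$ from any $\theta^{(n)}$ to $\theta^{(n+1)}$, with the $k$th component as the differing component. This shows that the Gibbs algorithm is indeed a particular case of Metropolis-Hastings with every jump accepted. □

When the conditional distribution of some parameters is not known explicitly, we cannot use Gibbs sampling to update the parameters, but we can still use the Metropolis-Hastings algorithm to estimate them. The following example demonstrates the use of Metropolis-Hastings in a discrete stochastic volatility model.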

Example 12.7 In this example, we present a case study on a simple discrete stochastic volatility (SV) model, using the MCMC technique to estimate the model parameters.

Let $y_t = \log S_t - \log S_{t-1}$ be the log-return of the stock price between time $t - 1$ and $t$, and let $h_t$ be the log-volatility at time $t$, for $t = 1, 2, \ldots, n$, where $n$ is the number of observations. Denote $y = (y_1, y_2, \ldots, y_n)$ and $h = (h_1, h_2, \ldots, h_n)$. We assume the model follows:


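$$y_t = \sqrt{e^{h_t}}\, \epsilon_t, \qquad (12.20)$$

$$h_{t+1} = \mu + \tau \eta_t, \qquad (12.21)$$

where $h_1 \sim N(\mu, \tau^2)$. $\epsilon_t$ and $\eta_t$ are assumed to be independent and follow the normal distribution with mean 0 and variance 1 as follows

$$\begin{pmatrix} \epsilon_t \\ \eta_t \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right)$$

for all $t \in \mathbb{N}$.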
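
A simulated path is a convenient way to see what this model implies and to test an estimation routine against known parameters. The Python sketch below is an illustration under assumed values: the function name `simulate_sv` and the parameters $\mu = -9$ and $\tau = 0.5$ are hypothetical and not taken from the book.

```python
import math
import random


def simulate_sv(n, mu, tau, seed=0):
    """Simulate (y_1..y_n, h_1..h_n) from Equations 12.20 and 12.21."""
    rng = random.Random(seed)
    # h_1 ~ N(mu, tau^2) and h_{t+1} = mu + tau * eta_t, so every h_t is N(mu, tau^2).
    h = [mu + tau * rng.gauss(0.0, 1.0) for _ in range(n)]
    # y_t = sqrt(exp(h_t)) * eps_t with eps_t ~ N(0, 1).
    y = [math.exp(0.5 * h_t) * rng.gauss(0.0, 1.0) for h_t in h]
    return y, h


y, h = simulate_sv(5000, mu=-9.0, tau=0.5, seed=1)
h_mean = sum(h) / len(h)
h_var = sum((v - h_mean) ** 2 for v in h) / len(h)
print(f"mean of h: {h_mean:.3f}, variance of h: {h_var:.3f}")
```

Note that Equation 12.21 carries no dependence on $h_t$, so the log-volatilities in this simple model are independent over time.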

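To sample the parameters, one of the possible ways is to perform the Gibbs sampling algorithm as follows:

Step 1: Initialize $h^{(0)}$, $\tau^2_0$ and $\mu_0$ and set $i = 1$.

Step 2: For $t = 1, \ldots, n$, sample $h_t^{(i)} \sim p\left(h_t \mid \mu_{i-1}, \tau^2_{i-1}, y, h_{<t}^{(i)}, h_{>t}^{(i-1)}\right)$, where $h_{>t}^{(i-1)} = \left(h_{t+1}^{(i-1)}, \ldots, h_n^{(i-1)}\right)$ and $h_{<t}^{(i)} = \left(h_1^{(i)}, \ldots, h_{t-1}^{(i)}\right)$.

Step 3: Sample $\mu_i \sim p\left(\mu \mid \tau^2_{i-1}, y, h^{(i)}\right)$.

Step 4: Sample $\tau^2_i \sim p\left(\tau^2 \mid \mu_i, y, h^{(i)}\right)$.

Step 5: Repeat Step 2 by setting $i = i + 1$ for $M$ times.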

By Bayes’ rule, we can derive the conditional posteriors as follows

$$p(\mu \mid \tau^2, y, h) \propto p(y \mid h)\, p(h \mid \mu, \tau^2)\, p(\mu) \quad \text{and}$$

$$p(\tau^2 \mid \mu, y, h) \propto p(y \mid h)\, p(h \mid \mu, \tau^2)\, p(\tau^2),$$

where $p(\mu)$ and $p(\tau^2)$ are independent priors. In this case, we take $p(\mu) \sim N(\alpha_\mu, \beta_\mu)$ and $p(\tau^2) \sim IG(\alpha_\tau, \beta_\tau)$, where $IG(\cdot, \cdot)$ denotes the inverse gamma distribution, and $\alpha_{(\cdot)}$ and $\beta_{(\cdot)}$ are hyperparameters specified by users. To obtain the conditional posterior distribution for $\mu$, we apply Bayes’ rule as follows

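$$\begin{aligned}
p(\mu \mid \tau^2, y, h) &\propto p(h \mid \mu, \tau^2)\, p(\mu) \\
&\propto \prod_{t=1}^{n} p(h_t \mid \mu, \tau^2)\, N(\alpha_\mu, \beta_\mu) \\
&\propto \exp\left\{ -\frac{1}{2\tau^2} \sum_{t=1}^{n} (h_t - \mu)^2 \right\} \exp\left\{ -\frac{(\mu - \alpha_\mu)^2}{2\beta_\mu} \right\} \\
&\propto \exp\left\{ -\frac{(\mu - \tilde\alpha_\mu)^2}{2\tilde\beta_\mu} \right\},
\end{aligned}$$

where

$$\tilde\alpha_\mu = \frac{\beta_\mu \bar h + \alpha_\mu \tau^2/n}{\beta_\mu + \tau^2/n}, \qquad \tilde\beta_\mu = \frac{\beta_\mu \tau^2}{n\beta_\mu + \tau^2}, \qquad \text{and} \qquad \bar h = \frac{1}{n} \sum_{t=1}^{n} h_t.$$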

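Similarly, the conditional posterior distribution for $\tau^2$ can be obtained as follows

$$\begin{aligned}
p(\tau^2 \mid \mu, y, h) &\propto p(h \mid \mu, \tau^2)\, p(\tau^2) \\
&\propto \left(\frac{1}{\tau^2}\right)^{n/2} \exp\left\{ -\frac{1}{2\tau^2} \sum_{t=1}^{n} (h_t - \mu)^2 \right\} \frac{\beta_\tau^{\alpha_\tau} e^{-\beta_\tau/\tau^2}}{\Gamma(\alpha_\tau)\, (\tau^2)^{\alpha_\tau + 1}} \\
&\propto \exp\left\{ -\frac{1}{\tau^2} \left[ \beta_\tau + \frac12 \sum_{t=1}^{n} (h_t - \mu)^2 \right] \right\} \left(\frac{1}{\tau^2}\right)^{(\alpha_\tau + n/2) + 1} \\
&\propto IG(\tilde\alpha_\tau, \tilde\beta_\tau),
\end{aligned}$$

where

$$\tilde\alpha_\tau = \alpha_\tau + \frac{n}{2} \quad \text{and} \quad \tilde\beta_\tau = \beta_\tau + \frac12 \sum_{t=1}^{n} (h_t - \mu)^2.$$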
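
Both conditional posteriors are standard distributions, so Steps 3 and 4 of the Gibbs sampler reduce to two random draws. The Python sketch below is our own illustration under stated assumptions: the function names, the hyperparameter values, and the simulated log-volatilities are hypothetical, and $h$ is held fixed here, whereas the full sampler would also update it in Step 2. The inverse gamma draw uses the fact that if $G$ is gamma distributed with shape $a$ and scale $1/b$, then $1/G \sim IG(a, b)$.

```python
import math
import random


def sample_mu(h, tau2, a_mu, b_mu, rng):
    """Draw mu | tau^2, h from its normal posterior; b_mu is the prior variance."""
    n = len(h)
    h_bar = sum(h) / n
    a_post = (b_mu * h_bar + (tau2 / n) * a_mu) / (b_mu + tau2 / n)
    b_post = b_mu * tau2 / (n * b_mu + tau2)
    return rng.gauss(a_post, math.sqrt(b_post))


def sample_tau2(h, mu, a_tau, b_tau, rng):
    """Draw tau^2 | mu, h from IG(a_tau + n/2, b_tau + 0.5 * sum (h_t - mu)^2)."""
    a_post = a_tau + len(h) / 2
    b_post = b_tau + 0.5 * sum((h_t - mu) ** 2 for h_t in h)
    # random.gammavariate takes (shape, scale), so the scale is 1 / b_post.
    return 1.0 / rng.gammavariate(a_post, 1.0 / b_post)


# Hypothetical check: log-volatilities with true mu = -1 and tau^2 = 0.25.
rng = random.Random(42)
h = [rng.gauss(-1.0, 0.5) for _ in range(2000)]
mu, tau2 = 0.0, 1.0
mu_draws, tau2_draws = [], []
for _ in range(500):
    mu = sample_mu(h, tau2, a_mu=0.0, b_mu=100.0, rng=rng)
    tau2 = sample_tau2(h, mu, a_tau=2.0, b_tau=1.0, rng=rng)
    mu_draws.append(mu)
    tau2_draws.append(tau2)

mu_hat = sum(mu_draws[100:]) / 400        # discard the first 100 draws
tau2_hat = sum(tau2_draws[100:]) / 400
print(f"posterior mean of mu: {mu_hat:.3f}, of tau^2: {tau2_hat:.3f}")
```

With 2,000 values of $h_t$ and weak priors, the posterior means should sit close to the values used to simulate $h$.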


To sample $h_t$ from $p(h_t \mid \mu, \tau, y, h_{-t})$, we first derive its conditional posterior distribution as follows

$$\begin{aligned}
p(h_t \mid \mu, \tau, y, h_{-t}) &\propto p(y_t \mid \mu, \tau^2, h_t)\, p(h_t \mid \mu, \tau^2, h_{-t}) \\
&\propto \frac{1}{\sqrt{2\pi e^{h_t}}} \exp\left( -\frac{y_t^2}{2 e^{h_t}} \right) \exp\left( -\frac{(h_t - \mu)^2}{2\tau^2} \right).
\end{aligned}$$

This density function is not easy to sample directly. One can use the acceptance-rejection method in Chapter 6 by finding a density $q(h_t)$ and a constant $c$ such that $p(h_t) \le c\, q(h_t)$ for simulating the conditional posterior distribution. The Metropolis-Hastings algorithm provides an alternative way to sample this density easily. Let $X_t$ be a Markov chain with transition density as the normal density $q(x_t) \propto \exp\{-(x_t - \alpha_h)^2 / (2\beta_h)\}$ so that it does not depend on any of the previous states. Specifically, simulate a random variable $x_t$ from $q(x)$ and accept $x_t$ as $h_t^{(i)}$ with probability

$$\min\left\{ \frac{p\left(x_t \mid \mu, \tau, y, h_{-t}\right) q\left(h_t^{(i-1)}\right)}{p\left(h_t^{(i-1)} \mid \mu, \tau, y, h_{-t}\right) q(x_t)},\ 1 \right\}.$$

Otherwise, set $h_t^{(i)} = h_t^{(i-1)}$ and move to sample $h_{t+1}^{(i)}$.
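
A single update of $h_t$ along these lines can be sketched in Python as follows. The sketch is ours and not the book's code: the function names and the proposal parameters $\alpha_h = 0$ and $\beta_h = 4$ are hypothetical. The check uses $y_t = 0$, for which the conditional posterior above reduces to $N(\mu - \tau^2/2, \tau^2)$, so the output of the chain can be compared with a known answer.

```python
import math
import random


def log_post_h(h_t, y_t, mu, tau2):
    """Log of p(h_t | mu, tau, y, h_{-t}), up to an additive constant."""
    return (-0.5 * h_t
            - 0.5 * y_t ** 2 * math.exp(-h_t)
            - 0.5 * (h_t - mu) ** 2 / tau2)


def update_h(h_old, y_t, mu, tau2, a_h, b_h, rng):
    """One Metropolis-Hastings update using the N(a_h, b_h) proposal q."""
    def log_q(z):
        return -0.5 * (z - a_h) ** 2 / b_h

    x = rng.gauss(a_h, math.sqrt(b_h))
    log_ratio = (log_post_h(x, y_t, mu, tau2) + log_q(h_old)
                 - log_post_h(h_old, y_t, mu, tau2) - log_q(x))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return x          # accept the candidate
    return h_old          # reject: keep the previous value


# Hypothetical check with y_t = 0, mu = 0, tau^2 = 1: the target is N(-0.5, 1).
rng = random.Random(7)
h_t, chain = 0.0, []
for _ in range(20_000):
    h_t = update_h(h_t, y_t=0.0, mu=0.0, tau2=1.0, a_h=0.0, b_h=4.0, rng=rng)
    chain.append(h_t)
chain_mean = sum(chain) / len(chain)
print(f"chain mean: {chain_mean:.3f}")
```

Inside the Gibbs sampler only one such update per $t$ and per sweep is needed; the long loop here serves only to confirm that the update leaves the intended distribution unchanged.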

Different choices of $q(x)$ can lead to different efficiency of the algorithm. Interested readers may refer to Jacquier et al. (1994) for details. In practice, the log-returns can be adjusted to have a mean of zero by subtracting from each $y_t$ the mean $u = \left(\sum_{t=1}^{n} y_t\right)/n$. Then, the mean-corrected returns $\tilde y_t = y_t - u$ can be applied directly in this simple SV model. Some software packages, such as WinBUGS, can perform the sampling conveniently for users.


12.6 EXERCISES 

1. Suppose that $X_1, \ldots, X_n$ are independent observations that follow $N(\mu, \sigma^2)$, where $\mu$ is a known quantity.

(a) Show that the likelihood function $L(\sigma^2)$ satisfies

$$L(\sigma^2) \propto (\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 \right\}.$$

(b) Suppose further that $\sigma^2 \sim IG(\alpha, \beta)$. What is the conditional distribution of $\sigma^2 \mid X_1, \ldots, X_n$?

Hint: Denote $p(\phi)$ as the density of the inverse Gamma distribution. Then we have

$$p(\phi) \propto \phi^{-(\alpha + 1)} e^{-\beta/\phi}.$$

2. A density function with a single parameter, $p(x \mid \theta)$, is said to be of the exponential family if it takes the form

$$p(x \mid \theta) = g(x)\, h(\theta) \exp\{t(x)\, \psi(\theta)\}.$$

Show that a normal mean with a known variance, a normal variance with a known mean, a Poisson distribution, and a binomial distribution are of the exponential family.

3. Show that if the likelihood function is from the exponential family and the prior distribution is from the exponential family, then the posterior distribution also belongs to the exponential family.

4. Simulate the daily jump-diffusion VaR (value-at-risk) of the Dow Jones Industrial Index on the basis of the data used in Section 7.2.3. Compare your value with the GED-VaR (generalized error distribution-value-at-risk) defined in Chapter 7.

5. Suppose that $x \mid p \sim \text{Bin}(n, p)$ and $p \mid x \sim \text{Be}(x + \alpha, n - x + \beta)$, where $n$ is a Poisson variable of mean $\lambda$. Use Gibbs sampling to find the unconditional distribution of $n$ where $\lambda = 16$, $\alpha = 2$ and $\beta = 4$.

6. Consider the normal distribution with an unknown mean $\mu$ and a known variance.

(a) Assume that the prior of $\mu$ is a discrete mixture of two normal densities. Show that this prior is still conjugate.

(b) Assume that the prior of $\mu$ is a discrete mixture of $k$ normal densities. Is the prior still conjugate?

7. Consider the following transition matrix of a Markov chain:

      1      2      3      4
1     1/6    0      1/2    1/3
2     0      1/3    1/3    1/3
3     0      1/2    0      1/2
4     1/4    1/4    1/4    1/4

Use the Metropolis-Hastings algorithm to construct a Markov chain whose limiting distribution is (1/6, 1/6, 1/3, 1/3) based on the aforementioned matrix.

8. Consider the transition matrix of another Markov chain:

      1      2      3      4
1     1/2    0      1/2    0
2     2/3    0      1/6    1/6
3     0      1/3    0      2/3
4     1/4    1/4    1/4    1/4

Use the Metropolis-Hastings algorithm to construct a Markov chain whose limiting distribution is (1/10, 2/10, 3/10, 4/10) based on the aforementioned matrix.

9. Modify the online VBA supplementary codes from Example 12.6 to generate the GED (see Section 7.2.2 for the details of this distribution). By choosing $\xi = 1.6$, compare the shape of the generated distribution to that of the standard normal distribution (which corresponds to $\xi = 2$ in the GED distribution) and the double exponential distribution (which corresponds to $\xi = 1$ in the GED distribution).

The solutions and/or additional exercises are available online at http://www.sta.cuhk.edu.hk/Book/SRMS/.